loklak

loklak is a server application that collects messages from various sources, including Twitter. The server contains a search index and a peer-to-peer index sharing interface. All messages are stored in an Elasticsearch index. An automatic deployment from the development branch on GitHub is available for testing at https://loklak-server-dev.herokuapp.com

‘Lok Lak’ is also a very tasty Cambodian stir-fry meat dish (usually beef) with a LOT of fresh black pepper. If you ever have the chance to eat Beef Lok Lak, please try it. I hope the name does not scare vegetarians; I am currently one myself.

Communication

Please join our mailing list to discuss questions regarding the project: https://groups.google.com/forum/#!forum/opntec-dev

Our chat channel is on gitter here: https://gitter.im/loklak/loklak

Why should I use loklak?

If you want to stay anonymous when searching, want to archive tweets or messages about specific topics, or are looking for a tool to create statistics about tweet topics, then you may consider loklak. With loklak you can:

  • collect and store a very large number of tweets
  • create your own search engine for tweets
  • avoid the authentication requirement for API requests on Twitter
  • share tweets and tweet archives with other loklak users
  • search anonymously on your own search portal
  • create your own tweet search portal or statistical evaluations
  • use Kibana to analyze large amounts of tweets for statistical data.

We Capture Messages With Distributed Scrapers

If you want to create an alternative Twitter search portal, the official way is to use the Twitter API to retrieve Tweets. But that interface needs an OAuth account, and it makes your search portal completely dependent on Twitter’s goodwill. The alternative is to scrape the Tweets from the Twitter HTML search result pages, but Twitter may still block your IP address. To circumvent this, you need many clients accessing Twitter to scrape search results. This makes it necessary to create a distributed peer-to-peer network of Twitter scrapers which can all organize, store and index Tweets. loklak was created as this solution.

Best of all: we made this very generic to integrate different microblogging services, so this may be the incubator for an independent short message or Twitter-like platform.

Not a Search Portal

Search portals consist of many components. The most prominent parts are content harvesters to acquire searchable content, a search index which provides fast and efficient access to the data, and a search front-end containing the user webpages and result display servlets:

_images/concept_searchengine.png

Most search portals differ in how they display search results but have almost the same back-end to create the search index. We want to support the creation of message/Twitter search portals, but the necessary and most generic part needs to be coded only once, even if we want several or even many different search front-ends:

_images/concept_messagesearch.png

So it is up to you to create a message search portal, but the very hard part has already been done by us. The front-end may even be there instantly (e.g. you can just use Kibana).

Collect Messages

Collected messages are written to two storage targets: an Elasticsearch search index and a backup and transfer dump.

_images/concept_messagesearchdetail.png
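
The two storage targets can be pictured with a small Python sketch. This is not loklak's implementation: the class name `MessageStore`, the message fields `id` and `text`, and the JSON-lines dump format are all assumptions made for illustration, and a plain dictionary stands in for the Elasticsearch index.

```python
import json
import os
import tempfile


class MessageStore:
    """Toy stand-in for the two storage targets: a search index and a dump file."""

    def __init__(self, dump_path):
        self.index = {}             # hypothetical stand-in for the Elasticsearch index
        self.dump_path = dump_path  # append-only dump, one JSON object per line

    def store(self, message):
        """Write one message to both targets."""
        self.index[message["id"]] = message
        with open(self.dump_path, "a", encoding="utf-8") as dump:
            dump.write(json.dumps(message) + "\n")

    def search(self, term):
        """Naive case-insensitive substring search over the indexed texts."""
        return [m for m in self.index.values() if term.lower() in m["text"].lower()]


dump_file = os.path.join(tempfile.mkdtemp(), "messages.jsonl")
store = MessageStore(dump_file)
store.store({"id": "1", "text": "Beef Lok Lak is tasty"})
store.store({"id": "2", "text": "loklak collects messages"})

# Read the dump back to show that it holds the same messages as the index.
with open(dump_file, encoding="utf-8") as dump:
    dumped = [json.loads(line) for line in dump]
```

The index answers queries, while the dump file is the part that can be backed up or handed to another instance.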

Distributed, Peer-to-Peer

loklak instances can be connected to each other. If you download loklak and run it unchanged, it connects to loklak.org by default as a back-end peer. You can change this if you want to. This is how connected peers work:

  • Whenever a peer acquires new Tweets, it reports these to the back-end for storage. This means that fresh Tweets are stored at four locations: your own Elasticsearch index, your own message dump file, and the Elasticsearch index and dump file of the remote back-end peer. As a result, all messages that you find are available for download by other users at the back-end, which is loklak.org by default.

    _images/component_jsonlistp2p.png
  • Whenever a peer starts up, it calls the back-end to announce its existence. This fills a peer table in the back-end which everyone can use to retrieve the list of active peers. Therefore everyone can identify peers which may provide message lists for download.

  • Any topology can be achieved when the user changes the host name of the back-end. You can create your own message-sharing network easily.
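
The behaviour of connected peers described above can be simulated in a few lines of Python. This is only an in-memory sketch under stated assumptions: the class `Node`, its attributes and the message fields are hypothetical, and real peers talk to each other over HTTP rather than through object references.

```python
class Node:
    """Hypothetical loklak-like instance with an index, a dump and a peer table."""

    def __init__(self, name, backend=None):
        self.name = name
        self.backend = backend   # another Node acting as back-end peer, or None
        self.index = {}          # stand-in for the Elasticsearch index
        self.dump = []           # stand-in for the message dump file
        self.peer_table = set()  # names of peers that announced themselves here
        if backend is not None:
            backend.peer_table.add(name)  # announce existence on start-up

    def store(self, message):
        """Write a message to this node's index and dump."""
        self.index[message["id"]] = message
        self.dump.append(message)

    def acquire(self, message):
        """Store a freshly acquired message locally and report it to the back-end."""
        self.store(message)
        if self.backend is not None:
            self.backend.store(message)


backend = Node("backend")
peer_a = Node("peer-a", backend=backend)
peer_b = Node("peer-b", backend=backend)
peer_a.acquire({"id": "42", "text": "hello from peer-a"})

# Count the locations that now hold message 42: two indexes and two dumps.
copies = sum("42" in node.index for node in (peer_a, backend)) + sum(
    any(m["id"] == "42" for m in node.dump) for node in (peer_a, backend)
)
```

Pointing `backend` at a different node is all it takes to form another topology in this model, which mirrors changing the back-end host name in a real installation.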

How do I install loklak: Download, Build, Run

Note

You must be logged in to Docker Cloud for the button to work correctly. If you are not logged in, you’ll see a 404 error instead.


Deploy buttons: Deploy to Heroku, Deploy on Scalingo, Deploy to Bluemix, Deploy to Docker Cloud

At this time, loklak is not provided in compiled form, but you can easily build it yourself. It’s not difficult and done in one minute! The source code is hosted at https://github.com/loklak/loklak_server. You can download it and run loklak with:

> git clone https://github.com/loklak/loklak_server.git
> cd loklak_server
> ant
> bin/start.sh

After all server processes are running, loklak tries to open a browser page itself. If that does not happen, just open http://localhost:9000; if you made the installation on a headless or remote server, then replace ‘localhost’ with your server name.

To stop loklak, run the following command (it blocks until the server has actually terminated):

> bin/stop.sh

A self-upgrading process is available which must be triggered by a shell command. Just run:

> bin/upgrade.sh

Where can I download ready-built releases of loklak?

Nowhere. You must clone the git repository of loklak and build it yourself. That’s easy, just do

  • git clone https://github.com/loklak/loklak_server.git
  • cd loklak_server
  • then build and run it (see “How do I install loklak: Download, Build, Run” above)

How do I install loklak with Docker?

To install loklak with Docker please refer to the loklak Docker installation readme.

How do I deploy loklak with Heroku?

You can easily deploy to Heroku by clicking the Deploy to Heroku button above. To install loklak using Heroku Toolbelt, please refer to the loklak Heroku installation readme.

How do I deploy loklak with cloud9?

To install loklak with cloud9 please refer to the loklak cloud9 installation readme.

How do I deploy loklak on Google Cloud with Kubernetes?

To install loklak on Google Cloud with Kubernetes, please refer to the loklak Google Cloud with Kubernetes installation.

How do I set up loklak on Eclipse?

To install loklak on Eclipse, please refer to the loklak Eclipse readme.

How do I run loklak?

  • build loklak (you need to do this only once, see above)
  • run bin/start.sh
  • open http://localhost:9000 in your browser
  • to shut down loklak, run bin/stop.sh
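
If you start loklak from a script, it can help to check whether the server already answers before you open the browser. The following Python sketch does that with the standard library only. The function name `server_is_up` and the two-second timeout are assumptions for illustration; only the address http://localhost:9000 comes from the steps above.

```python
import urllib.error
import urllib.request


def server_is_up(url="http://localhost:9000", timeout=2.0):
    """Return True if something answers HTTP requests at the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # the server answered, even if with an error status
    except OSError:
        return False  # connection refused, timeout, DNS failure, ...


if __name__ == "__main__":
    print("loklak reachable:", server_is_up())
```

On a headless or remote server, pass the server name instead of localhost, for example `server_is_up("http://my-server:9000")` with your own host name.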

How do I analyze data acquired by loklak?

loklak stores data in an Elasticsearch index. A front-end for the index is available in elasticsearch-head. To install it, do:

  • sudo npm install -g grunt-cli
  • cd into the parent directory of loklak_server
  • git clone https://github.com/mobz/elasticsearch-head.git
  • cd elasticsearch-head
  • npm install

Run elasticsearch-head with:

  • grunt server, which opens the administration page at http://localhost:9100

How do I configure loklak?

The base configuration file is conf/config.properties. To customize these settings, place a file named customized_config.properties in data/settings/.
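
The idea of a base file plus a customization file can be sketched in Python as follows. This is an illustration of layered settings, not loklak's loader: the key names `example.port` and `example.backend` are hypothetical, and the assumption that a customized value takes precedence over the base value should be checked against the real conf/config.properties.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Stand-in for conf/config.properties (key names are hypothetical).
base = parse_properties("""
example.port = 9000
example.backend = http://loklak.org
""")

# Stand-in for data/settings/customized_config.properties.
custom = parse_properties("""
example.backend = http://my-backend.example
""")

# Assumption: customized values override the base values, all others are kept.
effective = {**base, **custom}
```

Only the keys you want to change need to appear in the customization file in this model.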

Where can I find documentation?

The documentation is available at http://dev.loklak.org.

Where can I find showcases and tutorials?

Articles and tutorials are also on our blog at http://blog.loklak.net.

Where do I find the Java documentation?

At http://dev.loklak.org/javadoc, or build it yourself with ‘ant javadoc’.

Where can I get the latest news about loklak?

Hey, this is the tool for that! Just put http://api.loklak.org/api/search.rss?q=%23loklak into your RSS reader. Oh wait, you will get a lot of information about tasty Cambodian food with that as well. Alternatively you may read the author’s timeline using http://api.loklak.org/api/search.rss?q=0rb1t3r or just follow @0rb1t3r (that’s a zero after the at sign).
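
The `%23` in the first URL is the percent-encoded `#` of the hashtag. A short Python sketch shows how such feed URLs can be built for any query. The helper name `search_feed_url` is made up for this example; the endpoint is the one quoted above.

```python
from urllib.parse import urlencode

API_BASE = "http://api.loklak.org/api/search.rss"  # endpoint taken from the text above


def search_feed_url(query):
    """Return the RSS search URL for a query, percent-encoding characters such as '#'."""
    return API_BASE + "?" + urlencode({"q": query})


print(search_feed_url("#loklak"))
print(search_feed_url("0rb1t3r"))
```

The two printed URLs are the same ones given above for the hashtag search and the author’s timeline.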

How do I compile loklak using Gradle?

  • To install Gradle on Ubuntu:

$ sudo add-apt-repository ppa:cwchien/gradle
$ sudo apt-get update
$ sudo apt-get install gradle

  • To install Gradle on Mac OS X with Homebrew:

$ brew install gradle

To compile the source to classes and a jar file:

$ gradle build

The compiled files can be found in the build directory.

To remove the compiled classes and the jar file:

$ gradle clean

What is the software license?

LGPL 2.1

Where can I report bugs and make feature requests?

This project is considered a community work. The development crew consists of YOU too. I am very thankful for pull requests. So if you discover that something can be enhanced, please do it yourself and make a pull request. If you find a bug, please try to fix it. If you report a bug to me, I will possibly consider it, but at the very end of a giant, always growing heap of work. The best chance for you to get things done is to try it yourself. Our issue tracker is here.

Have fun!
@0rb1t3r