This project demonstrates what we learned during the fifth quarter of the Big Data course. We built a pipeline that mines data from Twitter with the Twint library and ingests it into a Confluent Kafka topic. A Python step then enriches the data and writes it to a second topic, from which Logstash indexes it into Elasticsearch. Kibana is the final component and is used to visualize the data.
To simplify setup, we wrote a Bash installer that speeds up installation and deployment on nodes or clusters. Clone the repository, run the installer, start the services, and then launch the two Python scripts:
git clone https://github.com/DanielDCM212/BigDataTwitter.git
cd BigDataTwitter
bash InstallHub.sh
bash StartService.sh
python3 ListeningKafka.py
python3 Enriqueser.py
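
The enrichment step that sits between the two Kafka topics can be pictured with a minimal sketch. This is not the code in Enriqueser.py: the function name `enrich_tweet`, the input field `tweet`, and the derived fields `hashtags` and `word_count` are all assumptions chosen for illustration. The Kafka consumer and producer are left out so the example runs on its own. It only shows the shape of the transformation applied to each message.

```python
import json
import re

# Matches hashtags such as "#Kafka" and captures the word after the "#".
HASHTAG_RE = re.compile(r"#(\w+)")


def enrich_tweet(raw_message: bytes) -> bytes:
    """Decode one JSON-encoded tweet, add derived fields, and re-encode it.

    Hypothetical helper: the field names used here are assumptions,
    not the schema used by the project.
    """
    tweet = json.loads(raw_message.decode("utf-8"))
    text = tweet.get("tweet", "")  # assumed name of the tweet text field

    # Derived fields that could later be aggregated in Kibana.
    tweet["hashtags"] = [tag.lower() for tag in HASHTAG_RE.findall(text)]
    tweet["word_count"] = len(text.split())

    return json.dumps(tweet, ensure_ascii=False).encode("utf-8")


if __name__ == "__main__":
    # Stand-in for a message read from the first Kafka topic.
    sample = json.dumps(
        {"username": "example_user", "tweet": "Learning #Kafka and #BigData"}
    ).encode("utf-8")
    print(enrich_tweet(sample).decode("utf-8"))
```

In a real deployment, a function like this would be called once per message consumed from the first topic, and its return value would be produced to the second topic that Logstash reads.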
