29 commits
ce56032
version 1 of the question and answer
masete Oct 22, 2019
4a0befe
modifying the procfile for heroku not to run clear redis file
masete Oct 23, 2019
4a3e44f
add newleric to follow app performance
masete Oct 23, 2019
af9e330
add line to run clear_redis containing redis url
masete Oct 23, 2019
6e6f566
add virtualenv
masete Oct 25, 2019
ede7066
comment flushdb method
masete Oct 25, 2019
59223f9
uncommenting flushdb in clear_redis.py
masete Oct 26, 2019
059a6f2
changing loglevel in procfile to verbose
masete Oct 26, 2019
593e706
changing loglevel in procfile to INFO
masete Oct 26, 2019
995c2be
add a header in bootstrap
masete Oct 27, 2019
f7c23eb
add head elements
masete Oct 27, 2019
b51ec51
feat(policy-analysis): enables people to compare different policies
masete Oct 27, 2019
567fc58
Add setting to force https
reiosantos Oct 28, 2019
ba8c5c2
Add OPP
masete Nov 8, 2019
678c1ca
add OOP
masete Nov 8, 2019
01afcdb
add opp
masete Nov 8, 2019
ce0d1b8
solving issues
masete Nov 11, 2019
54b528a
fixing
masete Nov 11, 2019
3848098
before refactoring
masete Nov 11, 2019
9d3ef36
add routes file
masete Nov 11, 2019
84b1800
add controllers
masete Nov 11, 2019
1783b7c
refactor
masete Nov 11, 2019
289aead
add clear reddis
masete Nov 12, 2019
5d23a2f
work
masete Nov 13, 2019
8b0e7b4
Delete app.pyc
masete Nov 13, 2019
58eba30
Fix App start up
reiosantos Nov 13, 2019
47beaaa
add urls to the templates
masete Nov 14, 2019
ac71592
add inheriting
masete Nov 14, 2019
fd5b049
add fixes
masete Nov 14, 2019
10 changes: 10 additions & 0 deletions .gitignore
@@ -0,0 +1,10 @@
/__pycache__/
venv
../.env
.DS_Store
.vscode
dump.rdb
/.idea/
/myenv
myenv/
/../.idea/
169 changes: 167 additions & 2 deletions README.md
@@ -1,2 +1,167 @@
# Machine-Learning-
Machine Learning Question and Answer on pollicy
# Policy Question Answering System

A web application that takes a user's question, surveys open access research articles about a given policy, and returns an answer to the user.

# Project Overview

Every year, millions of research articles on different policies and their consequences are published. Each of these is a rich source of information that can help policy advisors in determining the appropriate policies to implement. This application carries out a number of functions.

1. Provide a chatbot that the user interacts with.
2. Based on the user's answers, generate a question about a given policy consequence.
3. Process the question to obtain keywords and then use these keywords to fetch open access research articles about the policy.
4. Perform claim detection on the abstract of each article to determine whether the article found evidence for or against a given policy consequence.
5. Count the total number of articles that are for or against a given policy consequence.
6. Display this data in a visualization and also provide short summaries of the policy and its consequence.
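
Step 5 above can be sketched as follows. This is a minimal illustration only: the label strings `for`, `against`, and `unclear` and the helper `tally_stances` are assumptions made for this example, since the real output format of the claim detector is not shown in this README.

```python
from collections import Counter

# Hypothetical stance labels; the actual claim detector may use different values.
FOR, AGAINST, UNCLEAR = "for", "against", "unclear"


def tally_stances(labels):
    """Count how many abstracts fall under each stance label."""
    counts = Counter(labels)
    # Always report all three labels so the visualization gets a stable shape.
    return {stance: counts.get(stance, 0) for stance in (FOR, AGAINST, UNCLEAR)}


# One label per fetched abstract (made-up data for illustration).
labels = ["for", "against", "for", "unclear", "for"]
summary = tally_stances(labels)
print(summary)
```

Returning a fixed set of keys, even when a count is zero, keeps the data passed to the visualization predictable.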

# Running locally

Make sure that you have Redis, Python 3.6+, pip, and virtualenv installed on your computer.

Clone the git repository and change into the top directory.

```
git clone https://github.com/Prosper21/Policy-Question-Answering-System
cd Policy-Question-Answering-System
```

Create a virtual environment and activate it.

```
virtualenv venv
. venv/bin/activate
```

Install all the required packages by running the command:

```
pip install -r requirements.txt
```

Create a .env file to store your environment variables. In our case, this would have the following values.

```
SECRET_KEY = 'xxx' # Use your own secret key here
REDIS_URL = 'redis://localhost:6379'
```
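As a rough sketch of how these two values might be read at startup, assuming they have already been exported into the process environment (for example by a dotenv loader): the helper name `load_settings` is hypothetical, and the application's actual configuration code may differ.

```python
import os


def load_settings(env=os.environ):
    """Read the two settings named in the .env file from a mapping of variables."""
    secret_key = env.get("SECRET_KEY")
    if not secret_key:
        # Fail early rather than starting the app with no secret key.
        raise RuntimeError("SECRET_KEY is not set")
    # Assumption: fall back to the local Redis URL used in this README.
    redis_url = env.get("REDIS_URL", "redis://localhost:6379")
    return {"SECRET_KEY": secret_key, "REDIS_URL": redis_url}


# Passing a plain dict here keeps the example independent of the real environment.
settings = load_settings({"SECRET_KEY": "xxx"})
print(settings["REDIS_URL"])
```

Failing fast on a missing `SECRET_KEY` makes a misconfigured environment obvious before the web process or the workers start.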

We now have all the files and packages that we need to run the application on our local host.
In one terminal, start the redis server by running:

```
redis-server
```

In a second terminal, change into the Policy-Question-Answering-System directory, activate the virtual environment, and start the celery workers by running:

```
celery worker -A answer_policy_question.celery -O fair
```

In a third terminal, change into the Policy-Question-Answering-System directory, activate the virtual environment, and start your application by running:

```
python app.py
```

You should see the application running and you can access it on localhost:5000.

# Deploying on Heroku

Open an account on Heroku at <https://signup.heroku.com/>.

Install the Heroku CLI on your computer and carry out any other required configurations. The details can be found here <https://devcenter.heroku.com/articles/heroku-cli>.

Clone the git repository and change into the top directory.

```
git clone https://github.com/Prosper21/Policy-Question-Answering-System
cd Policy-Question-Answering-System
```

Create an app on Heroku, which prepares Heroku to receive your source code.

```
heroku create
```

In this case, Heroku generates a random name for your app. You can also provide your own name by running:

```
heroku create <app-name>
```

Because this application uses Heroku addons, we need to create those first. In our case, we need the redis addon. We do this by running:

```
heroku addons:create heroku-redis:hobby-dev
```

This will automatically set the REDIS_URL configuration variable for our application but we also need to set up our SECRET_KEY configuration variable. We do this by running:

```
heroku config:set SECRET_KEY='xxx' # Use your own secret key here
```

We can now deploy to Heroku by running:

```
git push heroku master
```
Go to the Heroku dashboard and make sure that your app's web and worker dynos are both switched on.

You can now visit the app at the URL generated from its name, i.e. `<app-name>.herokuapp.com`, or simply open the app by running:

```
heroku open
```

# Skills Needed

* Python 3.6+
* JavaScript/HTML/CSS
* Heroku

# Notes

If you make any changes, for example by installing new packages and running
```
pip freeze > requirements.txt
```
then you will have to add
```
https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.1.0/en_core_web_sm-2.1.0.tar.gz#egg=en_core_web_sm
```
to your requirements.txt. This ensures that the required spaCy model is loaded. If this causes a 'Double requirement given' error upon deployment, look for

```
en-core-web-sm=='x.x.x'
```

in your requirements.txt file and delete it.
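
The manual cleanup described in this note could be automated along the following lines. This is only a sketch: `fix_requirements` is a hypothetical helper, and the package versions in the example input are made up for illustration.

```python
SPACY_MODEL_URL = (
    "https://github.com/explosion/spacy-models/releases/download/"
    "en_core_web_sm-2.1.0/en_core_web_sm-2.1.0.tar.gz#egg=en_core_web_sm"
)


def fix_requirements(text):
    """Drop any pinned en-core-web-sm line and append the model URL exactly once."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        # pip freeze may write the name with hyphens or underscores.
        normalized = stripped.lower().replace("_", "-")
        if normalized.startswith("en-core-web-sm==") or stripped == SPACY_MODEL_URL:
            continue
        kept.append(line)
    kept.append(SPACY_MODEL_URL)
    return "\n".join(kept) + "\n"


# Made-up requirements content, as pip freeze might produce it.
example = "Flask==1.1.1\nen-core-web-sm==2.1.0\nspacy==2.1.8\n"
fixed = fix_requirements(example)
print(fixed)
```

Because the function also skips an existing copy of the URL before appending it, running it repeatedly on the same file contents gives the same result.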

# References

Bing Liu, Minqing Hu and Junsheng Cheng. "Opinion Observer: Analyzing and Comparing Opinions on the Web." Proceedings of the 14th International World Wide Web conference (WWW-2005), May 10-14, 2005, Chiba, Japan.

# TODO

## Front-End

### Chatbot UI

At the moment, the chatbot assumes that the user gives short, specific replies rather than full sentences. For example, when the bot asks 'What policy would you like to research today?', it expects a direct answer like 'cap and trade' rather than 'I would like to research cap and trade'. The same applies to the question 'And what effect of cap and trade on carbon emissions are you interested in?', where we expect an answer like 'carbon emissions' rather than 'I would like to know its effect on carbon emissions'. The reason is that the user's replies are handled by JavaScript embedded in the HTML files that render the page, so there is currently no way to apply natural language processing techniques to automatically identify the policy or phenomenon of interest in a full-sentence reply. I have not yet figured out how to process full-sentence replies on the backend while maintaining a conversational flow, but I believe this can be done.

Another possible improvement is moving all the JavaScript code in the HTML files to the JavaScript folder in the static directory. The recommended practice is that JavaScript and HTML code should not be mixed.

## Back-End

### Duplicates

Since we are fetching abstracts from three APIs, there is a chance of the same abstract appearing more than once in our results. It would be a good idea to remove these duplicates, but sometimes an extra character or space in the duplicate essentially makes the two versions of the abstract different. I think one could use regex to clean up the abstracts and then identify the duplicates.
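
One possible shape for the regex cleanup suggested above is sketched here. The function names and the normalization rules (lowercasing, dropping punctuation, collapsing whitespace) are assumptions for illustration, not the project's existing code, and the ASCII-only pattern would need adjusting for abstracts in other languages.

```python
import re


def normalize(abstract):
    """Lowercase, drop punctuation, and collapse whitespace for comparison."""
    text = abstract.lower()
    text = re.sub(r"[^a-z0-9\s]", "", text)
    return re.sub(r"\s+", " ", text).strip()


def drop_duplicates(abstracts):
    """Keep the first occurrence of each abstract, comparing normalized forms."""
    seen = set()
    unique = []
    for abstract in abstracts:
        key = normalize(abstract)
        if key not in seen:
            seen.add(key)
            unique.append(abstract)
    return unique


# Made-up abstracts; the second differs from the first only by extra spaces.
abstracts = [
    "Cap and trade reduced emissions.",
    "Cap and trade  reduced emissions. ",
    "Carbon taxes raised fuel prices.",
]
unique = drop_duplicates(abstracts)
print(len(unique))
```

The original text of the first occurrence is kept, so the normalization only affects comparison and never what is shown to the user.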


### Claim detection

At the moment, the claim detection algorithm is still a work in progress and could be improved further for better results.


10 changes: 10 additions & 0 deletions dar/.gitignore
@@ -0,0 +1,10 @@
../__init__.py
/__pycache__/
__init__.pyc
run.pyc
config/__init__.pyc
config/config.pyc
.env
/.idea/
config/__pycache__/
views/__pycache__/
2 changes: 2 additions & 0 deletions dar/Procfile
@@ -0,0 +1,2 @@
web: python clear_redis.py && python app.py
worker: celery worker -A answer_policy_question.celery -O fair --loglevel=INFO --concurrency=2 --max-tasks-per-child=1
Empty file added dar/__init__.py
Empty file.