In planning for future research, I used the open-source code in the Twitter database server module to collect tweets related to #NCDs from the Twitter Streaming API and save them in my personal database. The database contains separate tables for tweets, tweet_tags, tweet_urls, tweet_mentions, and users.
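To make the table layout concrete, here is a minimal sketch in Python with SQLite. Only the five table names come from the setup described above; every column name, the `create_schema` and `store_tweet` helpers, and the dictionary keys are my own illustrative assumptions, and the actual module's schema may well differ.

```python
import sqlite3

# Only the table names come from the post; all columns are assumed for illustration.
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    user_id     INTEGER PRIMARY KEY,
    screen_name TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS tweets (
    tweet_id   INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    tweet_text TEXT NOT NULL,
    created_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS tweet_tags (
    tweet_id INTEGER NOT NULL REFERENCES tweets(tweet_id),
    tag      TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS tweet_urls (
    tweet_id INTEGER NOT NULL REFERENCES tweets(tweet_id),
    url      TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS tweet_mentions (
    tweet_id    INTEGER NOT NULL REFERENCES tweets(tweet_id),
    screen_name TEXT NOT NULL
);
"""


def create_schema(conn):
    """Create the five tables if they do not exist yet."""
    conn.executescript(SCHEMA)


def store_tweet(conn, tweet):
    """Store one tweet (a plain dict with hypothetical keys).

    Returns True if the tweet was new, False if its id was already stored.
    """
    with conn:  # one transaction per tweet: commit on success, roll back on error
        conn.execute(
            "INSERT OR IGNORE INTO users VALUES (?, ?)",
            (tweet["user_id"], tweet["screen_name"]),
        )
        cur = conn.execute(
            "INSERT OR IGNORE INTO tweets VALUES (?, ?, ?, ?)",
            (tweet["id"], tweet["user_id"], tweet["text"], tweet["created_at"]),
        )
        if cur.rowcount == 0:
            return False  # duplicate tweet id, so skip the child rows as well
        tid = tweet["id"]
        conn.executemany(
            "INSERT INTO tweet_tags VALUES (?, ?)", [(tid, t) for t in tweet["tags"]]
        )
        conn.executemany(
            "INSERT INTO tweet_urls VALUES (?, ?)", [(tid, u) for u in tweet["urls"]]
        )
        conn.executemany(
            "INSERT INTO tweet_mentions VALUES (?, ?)",
            [(tid, m) for m in tweet["mentions"]],
        )
    return True
```

Keeping tags, URLs, and mentions in their own tables means a question like "which hashtags appear most often alongside #NCDs?" becomes a simple GROUP BY on tweet_tags rather than a text search over every tweet.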
To download and store all #NCDs-related tweets, you must connect to the Twitter API server and keep the connection open permanently so that tweets are received in real time, reestablishing the connection whenever there is a network error. The complete list of search terms is below. I have already collected 100,000 #NCDs-related tweets and am hoping to collect 1 million by sometime next week. With that data, what types of questions could we answer?
List of Search Terms: 'NCDs', '#NCDs', '#LIVESTRONG', '#UNSummit', '#tobacco', '#NCDChild', '@ncdaction', '@NCDs_PAHO', '#Diabetes', '@ncdalliance', '@HealthCaribbean', '#cancer', '#publichealth', '#smoking', '@globalhealthorg'.
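The two moving parts described above, the search terms and the always-on connection, can be sketched roughly as follows. This is not the module's actual code. It assumes the terms are sent as a single comma-separated `track` value, and it treats the connection itself as a hypothetical `connect` function that you supply and that yields tweets until the network fails. The names `build_track` and `consume_with_reconnect`, and the backoff delays, are likewise my own choices.

```python
import time

SEARCH_TERMS = [
    "NCDs", "#NCDs", "#LIVESTRONG", "#UNSummit", "#tobacco", "#NCDChild",
    "@ncdaction", "@NCDs_PAHO", "#Diabetes", "@ncdalliance", "@HealthCaribbean",
    "#cancer", "#publichealth", "#smoking", "@globalhealthorg",
]


def build_track(terms):
    """Drop repeated terms (keeping the original order) and join with commas."""
    return ",".join(dict.fromkeys(terms))


def consume_with_reconnect(connect, handle, max_failures=None,
                           base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Read tweets from connect() and reconnect whenever the stream drops.

    connect: hypothetical callable returning an iterable of tweets; it is
             expected to raise OSError (e.g. ConnectionError) on network errors.
    handle:  callable invoked once per tweet, e.g. to save it to the database.
    max_failures: give up after this many consecutive failed connections;
                  None means keep trying forever.
    """
    failures = 0
    while True:
        try:
            for tweet in connect():
                handle(tweet)
                failures = 0  # a delivered tweet shows the connection is healthy
        except OSError:
            pass  # network error: fall through to the backoff below
        failures += 1
        if max_failures is not None and failures >= max_failures:
            return
        # Wait 1s, 2s, 4s, ... (capped) so a struggling server is not hammered.
        sleep(min(max_delay, base_delay * 2 ** (failures - 1)))
```

In a real collector, `connect` would open the streaming request with `build_track(SEARCH_TERMS)` as its track value, and `handle` would write each tweet to the database. The growing delay between attempts is a design choice: reconnecting immediately in a tight loop after every failure risks getting the client rate-limited or blocked.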
Here is what accidentally downloading 10,000 tweets about Justin Bieber looks like: http://bit.ly/qT3EWV