
nl-uiuc - [nl-uiuc] Reminder (20min): AIIS: Miles Osborne -- Cross-stream event detection

nl-uiuc AT lists.cs.illinois.edu

Subject: Natural language research announcements

  • From: Yonatan Bisk <bisk1 AT illinois.edu>
  • To: nl-uiuc <nl-uiuc AT cs.uiuc.edu>, AIVR <aivr AT cs.uiuc.edu>, Vision List <vision AT cs.uiuc.edu>, aiis AT cs.uiuc.edu, aistudents AT cs.uiuc.edu, "Girju, Corina R" <girju AT illinois.edu>, Catherine Blake <clblake AT illinois.edu>, "Efron, Miles James" <mefron AT illinois.edu>, "Lee, Soo Min" <lee203 AT illinois.edu>, Jana Diesner <jdiesner AT illinois.edu>, "Raginsky, Maxim" <maxim AT illinois.edu>
  • Subject: [nl-uiuc] Reminder (20min): AIIS: Miles Osborne -- Cross-stream event detection
  • Date: Fri, 4 Apr 2014 13:38:43 -0500
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/nl-uiuc/>
  • List-id: Natural language research announcements <nl-uiuc.cs.uiuc.edu>

When: Today @ 2pm
Where: 3405 SC
Speaker: Miles Osborne


Cross-stream event detection
*************************************
Social media (especially Twitter) is widely seen as a source of real-time breaking news.  For example, when Osama Bin Laden was killed by US forces, the news was first made public on Twitter.  Rapidly finding all breaking news has clear economic and humanitarian benefits.

Finding all such breaking news presents hard computational challenges.  We need to detect news-related novelty in massive streams (upwards of two thousand posts per second) as quickly as possible.  Efficiency is not the only consideration, however: we also need to confront the enormous quantity of irrelevant posts.  In this talk I will outline how we tackle the first problem using Locality Sensitive Hashing, taking constant time per post.  In tandem I will mention how we use Storm to parallelise this computation, yielding a system capable of processing 2k tweets per second.  The second problem is tackled by intersecting the Twitter stream with Wikipedia page requests, filtering out spurious first stories.  Taken together, this results in processing more than 250 million items per day.  Finally, I will consider the question of whether Twitter really does lead Newswire for breaking news.
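
For readers unfamiliar with the constant-time idea, here is a minimal sketch in Python.  It is not the speaker's system: the abstract does not say which LSH variant is used, so this assumes sign-random-projection signatures with capped buckets.  The class name LSHNoveltyDetector and the parameters num_bits, num_tables and bucket_cap are hypothetical, chosen only for illustration.

```python
import hashlib
import math
from collections import Counter, defaultdict, deque


class LSHNoveltyDetector:
    """Toy novelty scorer: sign-random-projection LSH with capped buckets."""

    def __init__(self, num_bits=8, num_tables=4, bucket_cap=20, seed=0):
        self.num_bits = num_bits
        self.num_tables = num_tables
        self.seed = seed
        # Each table maps a bit signature to a bounded queue of recent posts,
        # so the number of comparisons per post never grows with the stream.
        self.tables = [
            defaultdict(lambda: deque(maxlen=bucket_cap))
            for _ in range(num_tables)
        ]

    def _component(self, table, bit, term):
        # Pseudo-random +1/-1 hyperplane coordinate derived from a hash, so
        # the vocabulary does not have to be known in advance (an assumption
        # of this sketch, not something stated in the abstract).
        key = f"{self.seed}:{table}:{bit}:{term}".encode("utf-8")
        return 1.0 if hashlib.sha256(key).digest()[0] & 1 else -1.0

    def _signature(self, table, vec):
        # One bit per hyperplane: which side of the plane the post falls on.
        bits = []
        for bit in range(self.num_bits):
            dot = sum(w * self._component(table, bit, t) for t, w in vec.items())
            bits.append(dot >= 0)
        return tuple(bits)

    @staticmethod
    def _cosine(a, b):
        dot = sum(w * b.get(t, 0) for t, w in a.items())
        norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
            sum(w * w for w in b.values())
        )
        return dot / norm if norm else 0.0

    def process(self, text):
        """Return a novelty score in [0, 1]; 1.0 means nothing similar was found."""
        vec = Counter(text.lower().split())
        best = 0.0
        for table_id, table in enumerate(self.tables):
            bucket = table[self._signature(table_id, vec)]
            # At most bucket_cap comparisons per table, whatever the stream size.
            for seen in bucket:
                best = max(best, self._cosine(vec, seen))
            bucket.append(vec)
        return max(0.0, 1.0 - best)
```

Each post is compared with at most num_tables * bucket_cap earlier posts, which is what keeps the work per post bounded.  A post would be treated as a candidate first story when its score exceeds some threshold; that threshold is a tuning choice and is not given in the abstract.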

Joint work with Sasa Petrovic (Edinburgh), Craig MacDonald (Glasgow), Iadh Ounis (Glasgow) and Richard McCreadie (Glasgow)

Biography
*********
Miles Osborne is a Reader in Informatics at Edinburgh, with research interests in Machine Translation, Social Media and large-scale processing of natural language.  He received his PhD from the University of York in 1994 and then travelled the land, carrying out postdocs at Cambridge and Groningen before coming to Edinburgh.  He spent a sabbatical at Google in 2006 working within their Machine Translation group, and for 2013-2014 is spending a sabbatical at Johns Hopkins.


