Practical Introduction to Machine Learning
Event Information
Description
RadialPoint is hosting another community tech talk. This time we're bringing in Pablo Duboue, who will give a short introduction to the vast field of Machine Learning. The focus will be on practice rather than theory.
Abstract
I will present the main areas of the field (supervised, unsupervised, semi-supervised, and reinforcement learning) and share examples and lessons learned from 14 years of applying Machine Learning to Natural Language Processing and Information Retrieval.
I will show examples using the following tool-kits:
- Weka 3 (http://www.cs.waikato.ac.nz/ml/weka/)
- Apache Mahout (http://mahout.apache.org/)
- Apache OpenNLP MaxEnt (http://maxent.sourceforge.net/about.html)
The examples will come from my previous work, in fields ranging from text mining in molecular biology [1] and question answering [2] to recent work I did for MatchFWD on clustering companies by employee migrations.
[1] Disambiguating proteins, genes, and RNA in text: a machine learning approach, V Hatzivassiloglou, PA Duboue, A Rzhetsky (2001), Bioinformatics 17 (suppl 1), S97-S106.
[2] A framework for merging and ranking of answers in DeepQA, DC Gondek, A Lally, A Kalyanpur, JW Murdock, PA Duboue, L Zhang, Y Pan, ZM Qiu, C Welty (2012), IBM Journal of Research and Development.
The objective of this presentation is to demystify Machine Learning and to encourage the audience to incorporate Machine Learning tools and approaches into their everyday work. I will also discuss problems I have identified regarding user support for systems incorporating trained models.
About the speaker
Dr. Duboue is an independent language technologist. His work focuses on applied language technology and natural language generation. He received a Licenciatura en Computacion degree from Cordoba University (Argentina) in 1998 and M.S., M.Phil., and Ph.D. degrees in Computer Science from Columbia University in the City of New York in 2001, 2003, and 2005, respectively (dissertation title: "Indirect supervised learning of strategic generation logic"). He is passionate about improving society through language technology and splits his time between teaching, doing research, and contributing to free software projects. He has taught at Cordoba University, Columbia University, and Siglo21 University, and has worked for IBM TJ Watson Research as a Research Staff Member. At IBM Research he helped create the Watson (TM) Question Answering system that participated in the Jeopardy! (TM) Grand Challenge.