Introduction to Logistic Regression


Presenter: David Lewis

David D. Lewis is an independent consultant working in the areas of information retrieval, machine learning, and natural language processing. He has worked with startups, large corporations, and governmental organizations on the design, implementation, acquisition, and fielding of systems for manipulating and mining text data. He has helped write research grants receiving more than US $5 million in funding, has published more than forty scientific papers, and holds six patents. He has been a member of committees for the U.S. government MUC and TREC evaluations of language processing technologies. He holds a Ph.D. in Computer Science from the University of Massachusetts at Amherst. His Ph.D. dissertation won the 1992 American Society for Information Science Doctoral Forum Award.


This tutorial will present a broad-ranging introduction to logistic regression, a flexible, effective approach to supervised learning of classifiers. The emphasis is on diversity of perspectives. Logistic regression has been discovered and rediscovered under a range of names and notations, in a variety of fields. The tutorial will attempt to present the best of the insights from all fields. The tutorial also indirectly serves as an overview of a number of important concepts in modern machine learning, including loss functions, regularization, model misspecification, optimization algorithms, and the extent to which the effectiveness of trained models can be predicted. Issues relevant to information retrieval, such as high dimensionality, sparse data, and noise, will be discussed along with their implications.

Where possible, the tutorial will emphasize qualitative and graphical presentations of information rather than equations. However, some mathematical sophistication and prior exposure to supervised learning are desirable for tutorial attendees. Computational examples using publicly available software and datasets will be presented. SIGIR 2005 will be the first time this tutorial has been presented.


Copyright 2004, ACM. All rights reserved.