Sunday, April 22, 2018

a dumb algorithm with lots and lots of data beats a clever one with modest amounts of it

--- Pedro Domingos, professor of computer science at the University of Washington, Seattle, in "A Few Useful Things to Know About Machine Learning," Communications of the ACM (2012).

Quote in context:
Suppose you have constructed the best set of features you can, but the classifiers you receive are still not accurate enough. What can you do now? There are two main choices: design a better learning algorithm, or gather more data (more examples, and possibly more raw features, subject to the curse of dimensionality). Machine learning researchers are mainly concerned with the former, but pragmatically the quickest path to success is often to just get more data. As a rule of thumb, a dumb algorithm with lots and lots of data beats a clever one with modest amounts of it. (After all, machine learning is all about letting data do the heavy lifting.)
The immediately following paragraph also has some good stuff:
This does bring up another problem, however: scalability. In most of computer science, the two main limited resources are time and memory. In machine learning, there is a third one: training data. Which one is the bottleneck has changed from decade to decade. In the 1980s it tended to be data. Today it is often time. Enormous mountains of data are available, but there is not enough time to process it, so it goes unused. This leads to a paradox: even though in principle more data means that more complex classifiers can be learned, in practice simpler classifiers wind up being used, because complex ones take too long to learn. Part of the answer is to come up with fast ways to learn complex classifiers, and indeed there has been remarkable progress in this direction (for example, Hulten and Domingos).

Thursday, April 19, 2018

Most people who’d love to be novelists don’t write novels, and that’s because they’re not really interested in doing so

--- Paul J. Griffiths, Warren Professor of Catholic Theology at Duke Divinity School, in "Letter to an Aspiring Intellectual: Outlines of the Life of the Mind," First Things, May 2018

Quote in context:
From your letter, and especially from your list of people you like to read, I think that at the moment you’re in love with the idea of being an intellectual rather than with some topic for thought. You’d like to be the kind of person who writes books like Regarding the Pain of Others or the Lam-rim chen-mo, rather than being already deeply enmeshed in the toils of thought about some particular topic. This may be a sign that you’re not yet serious, that, as Augustine said of himself in his salad days, you’re in love with love rather than simply in love. Most people who’d love to be novelists don’t write novels, and that’s because they’re not really interested in doing so. They’re infatuated with an image and a rôle rather than with what those who play that rôle do. So, perhaps, with you; if so, the infatuation will fade as you grow older, and you’ll do something closer to the rough ground of material necessity.
Some other gems:
"So: Find something to think about that seems to you to have complexity sufficient for long work, sufficient to yield multifaceted and refractory results when held up to thought’s light as jewelers hold gemstones up to their loupes. And then, don’t stop thinking about it."
"You need a life in which you can spend a minimum of three uninterrupted hours every day, excepting sabbaths and occasional vacations, on your intellectual work. ... You need this because intellectual work is, typically, cumulative and has momentum."
"The most essential skill is surprisingly hard to come by. That skill is attention. Intellectuals always think about something, and that means they need to know how to attend to what they’re thinking about. Attention can be thought of as a long, slow, surprised gaze at whatever it is."
"Don’t do any of the things I’ve recommended unless it seems to you that you must. ... Undertake it if, and only if, nothing else seems possible."

Tuesday, April 17, 2018

Most exciting ideas are not important, most important ideas are not exciting, not every problem has a good solution, and every solution has side effects.

--- Dan Geer, testimony to the U.S. House of Representatives Committee on Science, Subcommittee on Technology, Washington, DC, 11 February 1997

Quote in context:
I am reminded of what I know as the four verities of government:

  • Most exciting ideas are not important,
  • Most important ideas are not exciting,
  • Not every problem has a good solution, and
  • Every solution has side effects.