

arguments are more like dances than warfare.
After stewing in uncertainty for a good couple of weeks, I’ve caught hold of some threads.
My advisor said to me that research directions often start with a hunch. You get a feeling that something valuable is sitting just beyond the horizon; you can’t articulate whether it’s glimmering gems or Sanskrit scrolls, and you don’t even know the exact route you’re going to take. All you have is the light feeling of a thread leading westward.
When I explain my project to someone, I’m met with one of two reactions. Either they get it very quickly and want to hear about what I’ve found, or they don’t understand the point of my inquiry and not much I say moves them. Why open source machine learning? Why does it matter? It feels so obvious to me, but it’s hard to convey, especially to someone who’s not a practitioner. My feeling is strong, but the details are vague. I see the outline of a treasure, but it’s still barely perceptible.
I think it’s time for me to make a quantitative case for caring about open source machine learning: a case that others can build on, and that will convince both the congressional staffer and the innovation economist.
Writing
I spent quite a bit of time laying out a narrative for the presentation I gave to my lab on Wednesday. The narrative needs tightening, but I’m beginning to learn how to tell a better story around my research.
Notes
Gitee is China’s equivalent of GitHub. Perhaps unsurprisingly, its user base is far smaller. PaddlePaddle, for instance, one of the most popular Chinese packages (for distributed deep learning), has 3k stars on Gitee versus 16k stars on GitHub. What are the other differences between GitHub and Gitee (cultural, censorship, range of content)?
Gunther Weil (student of Leary) gave a fascinating lecture with a great set of references. To follow up on: Ivan Illich’s Tools for Conviviality, Buckminster Fuller’s spiritual inspiration, David Bohm’s implicate and explicate order (also to finish his ‘On Dialogue’), the I Ching, the Seventh Generation principle of the Iroquois Confederacy, strange attractors, Nikola Tesla’s autobiography.
External Links
Minsky on Neats versus Scruffies in classical AI research (wiki). Cathedrals versus Bazaars in open source development. Worse-is-better versus the right thing, for C versus Lisp.
Do you know of any other helpful dichotomies in thinking about programming languages/software development more generally?
Book Review: Barriers to Bioweapons. I’m reading a book about the Russian bioweapon program for a book club this month. This excellent review provides some relieving counterweight to the fear that Biohazard stirs up.
One interesting takeaway is that covertness has a substantial cost: forcing a program to “go underground” is a huge impediment to progress. This suggests that the Biological Weapons Convention, which has been criticized for being toothless and lacking provisions for enforcement, is already doing very useful work simply by forcing programs to be covert at all. Of course, Ouagrham-Gormley recommends adding those provisions anyway, as well as checks on signatory nations, like random inspections, that add more effectively to the cost of maintaining secrecy for any potential efforts. I agree.
Books
Metaphors We Live By, Lakoff and Johnson. One of the books that is changing the way I perceive language and thought. They argue that metaphors are not simply tools for expression, but are intrinsic to the way we perceive the world. Our habit of describing arguments as war ("he won the debate", "your claims are indefensible"), for instance, frames disagreements as inherently antagonistic. The implications are really significant. For a nice overview, read vgr’s blog post about it.
Planning
Review
From past plan:
Create presentation for lab meeting (done - this was a success)
Theoretical sampling of repos
Make a grouping of similar papers in lit review
Generate a few different potential methodologies
Sit with uncertainty (still sitting with this)
Extra things I did:
Met with some programming language people (very helpful for contextualizing Python’s role in the ML space)
Met with someone from the Ostrom workshop, who helpfully pointed me towards the literature around bioinformatics commons
Next Two Weeks
Update my research timeline (Q4 plan)
Create document outlining an approach to estimating value in the OSML space. (Panel data approach, approach using fractions of existing estimates of OSS value?)
Start reading through version updates of the packages I am interested in
Keep up weekly conversations with range of people
Reach out to Weitzner