Yesterday I outlined a strategy of rapid learning: the core is doing many problems with feedback. When you need supplemental information, you can do either a quick lookup or refer to lectures or a book. I can't tell you how the strategy worked for me just yet! My interview was rescheduled.
How can you apply this strategy to your own learning? The problem is, well, you need problems. Preferably lots of them, well sequenced, with feedback. Books are the best bet, but there is an emerging class of web apps providing problems and feedback. I've bolded the ones that let you do everything in the browser, including verification of your answers.
Code.org has good coverage of interactive beginner tools. To quickly learn a new language, Code School and Codecademy cover a variety and here are some options for Ruby (or here), Python, Java, Lisp, Prolog, OCaml, Clojure, ClojureScript. Not web-based but here's a list of repositories for koans in many languages, and a mixed variety of resources. For practice with algorithms and other such challenges, use Project Euler (more math-oriented) or HackerRank (topics from algorithms to AI and data). Kaggle is specifically for machine learning and ranges from simple data sets to paid competitions, where even professionals can hone skills. For more introductory data tools, there's DataCamp.
I've accepted an offer after more than two months of dedicated job hunting.
I submitted a large number of applications. Quite a few were still in progress when I accepted, but of the 27 that either went through the process or were ignored for long enough to consider dead, the results were
Today's (very first) session is Machine Learning for Hackers (ML4H), chapter 3.
Machine learning is at the top of my list of skills, and I'm learning it in Python, which has ample libraries, community support, and a clean, easy syntax. ML4H is written for R, but there's already a blog series and GitHub repo porting the code to Python.
I'm already familiar with Naive Bayes classification, the approach used here, but I haven't used the Python-based NLTK library. In fact, I recently rolled my own text classification setup in Python and Scikit-Learn's Naive Bayes, so I'm paying particular attention to how the code is structured and what NLTK provides.
The code uses the following methods:
I've been doing interviews in Java for whatever reason, but since I'm focusing on Python for data, I might as well be doing my code puzzles in Python.
While practicing, I prefer having working solutions available. That's what I did a few days ago with OCaml here: ocaml.org/learn/tutorials/99problems.html. Having the solutions makes it easier to give up, but it's better to give up and see a solution than to just skip to the next problem.
Also nice is that the OCaml problems come with test cases. OCaml and Python both have a built-in assert (a statement in Python, an expression in OCaml) that makes testing your code clean and easy.
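As a minimal sketch of what that looks like in Python, here is a version of the first of the 99 problems (return the last element of a list). The function name `last` and the choice to return None for an empty list are my own assumptions for illustration, not something taken from the OCaml solutions.

```python
def last(items):
    """Return the last element of a list, or None if the list is empty."""
    return items[-1] if items else None


# Each assert passes silently when its condition is true and raises
# AssertionError (with the optional message) when it is false.
assert last(["a", "b", "c", "d"]) == "d"
assert last([]) is None, "an empty list should give None"
```

If an assert fails, the script stops with a traceback pointing at the failing line, which is usually enough feedback for a practice problem.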
Tonight I redid three questions from my Facebook interviews. Sorry, not sharing those.
I just did my first interview in Python and it turns out I made the switch from Java prematurely. Disaster. Here are some things I didn't know.
Constructor overloading. Function overloading is when a function with the same name is defined in two different ways and distinguished by its parameters. In particular, constructor overloading would allow you to instantiate a class in two different ways: say, a rational number that is defined either by two ints as a/b or by one float. Does Python have function overloading? I didn't know.
It turns out Python is quite flexible in its parameters and relies on that flexibility rather than on overloading. So for the rational number example, you could either have two optional parameters and check which ones were supplied, or you could accept a single dictionary and check its contents.
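A minimal sketch of the optional-parameter approach, assuming a hypothetical `Rational` class (not the code from my interview). To keep it short it leans on the standard library's `fractions.Fraction` to do the conversion and reduction, which is a shortcut I've swapped in rather than something an interviewer would necessarily accept.

```python
from fractions import Fraction


class Rational:
    """Hypothetical rational number with one flexible constructor."""

    def __init__(self, a, b=None):
        if b is None:
            # One-argument form: treat `a` as a float (or int) and convert it.
            frac = Fraction(a)
        else:
            # Two-argument form: `a` is the numerator, `b` the denominator.
            frac = Fraction(a, b)
        self.num = frac.numerator
        self.den = frac.denominator

    def __repr__(self):
        return "Rational(%d, %d)" % (self.num, self.den)


half = Rational(1, 2)      # built from two ints
also_half = Rational(0.5)  # built from one float
```

Both calls go through the same `__init__`; the default value `b=None` is what lets the constructor tell the two forms apart.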
Throwing exceptions. The interviewer was insistent that my class constructor conform to the two constructor options, so my suggestion was to check the parameters and throw an exception when they didn't match. Of course, I didn't know how. The keyword in Python is raise. It can be used alone inside an except block to re-raise the exception being handled, or with an exception object (which can be an instance of a user-defined class). See more at http://docs.python.org/2/tutorial/errors.html
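Here is a rough sketch of both uses of raise. The exception class `InvalidRationalError` and the helper functions are hypothetical names made up for this example.

```python
class InvalidRationalError(ValueError):
    """Hypothetical user-defined exception for bad constructor arguments."""


def make_rational(a, b=None):
    """Return (numerator, denominator), or raise if the arguments don't fit."""
    if b is None:
        if not isinstance(a, float):
            raise InvalidRationalError("single argument must be a float")
        return a.as_integer_ratio()
    if not (isinstance(a, int) and isinstance(b, int)):
        raise InvalidRationalError("two-argument form needs two ints")
    if b == 0:
        raise InvalidRationalError("denominator must be nonzero")
    return (a, b)


def checked(a, b=None):
    try:
        return make_rational(a, b)
    except InvalidRationalError:
        # A bare `raise` re-raises the exception currently being handled.
        raise


try:
    make_rational(1, 0)
except InvalidRationalError as err:
    message = str(err)
```

Subclassing a built-in such as ValueError means callers can catch either the specific error or the general one.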
Identity testing. Later, I wanted to test if two objects were the same object in memory. I suggested == but it turns out this does a value equality test. For a custom class, you need to define an __eq__ method to determine equality; without one, == falls back to comparing identity, so it is true only if both sides are the same object. So technically I think it worked. But the right way to test for the same object in memory is Python's is keyword:
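A small sketch of the difference, using two hypothetical classes: `Point` has no __eq__, while `EqPoint` defines one.

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y


class EqPoint(Point):
    def __eq__(self, other):
        # Value equality: same type and same coordinates.
        return isinstance(other, EqPoint) and (self.x, self.y) == (other.x, other.y)


p, q = Point(1, 2), Point(1, 2)
same = p  # a second name for the same object

default_eq = (p == q)      # False: no __eq__, so == compares identity
alias_eq = (p == same)     # True: both names refer to one object
alias_is = (p is same)     # True

e1, e2 = EqPoint(1, 2), EqPoint(1, 2)
value_eq = (e1 == e2)      # True: __eq__ compares the coordinates
identity = (e1 is e2)      # False: two separate objects in memory
```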
As mentioned, I wanted to quickly advance through Python practice problems and be able to see working solutions. The best method I was able to find is HackerRank. You can access previous contests, do coding and testing in many languages right in the browser, and then see others' code on the leaderboard, so it's great to see a variety of solutions and languages.
However, I'm starting to get diminishing returns from doing coding puzzles. Better to read a book on best practices, especially object oriented. Here are some more tidbits from today:
Pass by assignment. Every language seems to have a slightly different approach to passing parameters to a method. Python's is called pass by assignment. The best explanation I saw of this is that every assignment in Python is binding a name to an object. x = 1 means that x refers to the 1 object of type int. (In CPython, small integers such as 1 are cached, so every 1 is the same object.) So here, when passing an integer, b is, within the function, being bound to a different object, 3, and a is not affected.
However, when you pass a list, the list that a and b point to is being modified in place, so the change is visible through a as well. With this understanding, there's no need for concepts of pass-by-value, pass-by-reference, or mutable/immutable.
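The two cases can be sketched as follows. The function names `rebind` and `mutate` are hypothetical, chosen only to label what each one does to its parameter b.

```python
def rebind(b):
    b = 3          # binds the local name b to a different object


def mutate(b):
    b.append(3)    # modifies the object that both names refer to


a = 1
rebind(a)
after_rebind = a          # still 1: only the local name b was rebound

a = [1, 2]
mutate(a)
after_mutate = a          # [1, 2, 3]: the shared list was changed in place

a = [1, 2]
rebind(a)
after_rebind_list = a     # still [1, 2]: rebinding never affects the caller
```

The last case shows that the behavior depends on what the function does (rebinding versus modifying in place), not on the type of the argument as such.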
Hi all explorers of my blog,
What are you learning? Why? And what can I share that would be helpful to you? Let me know!
My second interview in Python went better. I wanted to keep a data structure of keys and values where I'd repeatedly retrieve and remove the minimum key and then insert a new value. This is a priority queue of fixed size k.
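For what it's worth, Python's standard library has a heapq module that fits this pattern. A minimal sketch, with made-up keys and values and k = 3:

```python
import heapq

# Hypothetical (key, value) pairs; tuples compare by key first.
heap = [(5, "e"), (2, "b"), (9, "i")]
heapq.heapify(heap)  # rearrange the list into heap order, in place

# Pop the smallest item and push a new one in a single step,
# so the heap stays at size k.
smallest = heapq.heapreplace(heap, (4, "d"))

# The smallest remaining item always sits at index 0.
next_smallest = heap[0]
```

Each heapreplace call takes O(log k) time, compared with O(k log k) for re-sorting on every step.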
I wasn't sure what Python offered natively, so I went with a less efficient method of repeatedly sorting a dictionary by keys. I correctly guessed the syntax (the interviewer was lenient when I said I wasn't sure):
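The sorting approach was presumably something along these lines; the dictionary contents here are made up for illustration.

```python
d = {3: "c", 1: "a", 2: "b"}

keys_in_order = sorted(d)   # iterating a dict yields its keys
min_key = keys_in_order[0]
min_value = d.pop(min_key)  # retrieve and remove the minimum entry
d[4] = "d"                  # insert a new entry; the size is unchanged
```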
Note that this returns a list of keys, so the value for the minimum key is d[sorted(d)[0]]. Here's an option for sorting by value, which also returns a list of keys:
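One way to sort by value is to pass the dictionary's own get method as the sort key. A short sketch with a made-up dictionary:

```python
d = {"x": 3, "y": 1, "z": 2}

# sorted() iterates over the keys; key=d.get orders them by their values.
keys_by_value = sorted(d, key=d.get)

# The key with the smallest value comes first.
min_value_key = keys_by_value[0]
```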
I've been applying to a few quantitative trading firms. My interviews so far have been coding and math puzzles, but I'm told that tomorrow's will involve mathematical finance. I have absolutely no background in that. What to do in a few hours of studying?
In college, I always felt there was something off about sitting through long lectures and taking methodical notes that I referred back to at most once before the exam. My solution then was to use a spaced repetition system (SRS) to make all the pertinent facts stick. I talk a lot about spaced repetition systems--and their shortcomings--on my other blog. In this case though, the main issue is that I don't have time to create material for SRS anyway.
My strategy has been a mix of the following:
Is this chaotic flurry of learning good over a longer term? Well, I don't have time to answer tonight! But I will say this: I've felt like I'm doing better on math challenges than coding/algorithm ones. I've studied and worked with programming much more in the last few years, but it doesn't seem to compare with the massive amounts of quick, targeted math problems I did in high school--probably 100-200 problems per week. Those were mostly more basic than math olympiad problems, and I'd say the reason I didn't do better in math competitions is that I didn't do much of the less satisfying work of reading solutions for hard problems just outside of my range.