Wednesday, February 21, 2018

Hierarchical Hierarchical Clustering: A Concept So Nice, We Named It Twice

While working at Resonai, I wrote a piece of code that performs Hierarchical Clustering, in collaboration with David Lehavi. In addition to various optimizations I won't get into, we applied a nice heuristic that allowed a considerable improvement in the program's memory footprint, as well as the running time. The more formal name we gave it was Spatially Sensitive Hierarchical Clustering (SSHC), but ended up referring to it as Hierarchical Hierarchical Clustering which is funnier, and better reflects what's really going on.

Hierarchical Clustering in a Nutshell

The following image, taken from the Wikipedia article on HC, illustrates the basic notion rather well:


Suppose we have 6 items that we wish to cluster, and we have some metric that we can compute between subsets of those items. We get a hierarchy of clusters via the following greedy algorithm:
  1. Let each item be a cluster (with a single element)
  2. Let $S$ be the set of all clusters
  3. While $|S| > 1$:
    1. Find the closest pair of clusters $X, Y$ in $S$.
    2. Define a new cluster $Z := X \cup Y$
    3. Add $Z$ to $S$, remove $X$ and $Y$ from $S$
So the diagram above implies the following possible sequence of cluster unions (it may be slightly different):
  1. $\{b\} \cup \{c\}$
  2. $\{d\} \cup \{e\}$
  3. $\{d,e\} \cup \{f\}$
  4. $\{b,c\} \cup \{d,e,f\}$
  5. $\{a\} \cup \{b,c,d,e,f\}$
There are a number of obvious optimizations, for example, one does not have to recompute all distances after each creation of a new cluster. Rather - only the distances that involve the new cluster.

What is It Good For?

In most of the use cases that we employed, the initial atoms were triangular facets of a 3D mesh. One such use case is mesh segmentation, that is the process of breaking down a 3D object into meaningful sub-parts. There's a well-known paper from 2006 by Attene et al that describes such an approach. The metric chosen for distance between segments(=clusters) is how much they resemble certain primitive shapes  the authors chose in advance (cylinder, cube, sphere, etc.).
As can be seen in this image taken from the paper, once one has the full hierarchy of segments, this tree can be trimmed according to the number of clusters one wishes.
source: "Hierarchical mesh segmentation based on fitting primitives" by Attene et al

The Problem With HC

The natural approach we took was to consider this not as atoms where all pairwise distances needed to be considered, but as a graph, where only distances between neighboring vertices had to be considered. In the 3D mesh case - adjacent faces were represented by neighboring atoms in the HC tree. So now we simply put all the edges of the graph (and the respective distances) into a some implementation of a min-heap, and began the simple process of:
  • Extracting the minimal edge
  • Uniting the clusters it connected
  • Updating the graph (and the edges in the heap) accordingly
This became an issue when the number of items stored in the heap was in the millions and tens of millions, which is very much the case when you get a high-quality 3D-scan of a room, for example. It turned out that operations on the heap became unbearably expensive, and the memory footprint was terrible, since we had to store this huge heap in the RAM.

The Solution: HHC

We realized that changes in the HC tree were almost always very local - one computes the distance between pairs of clusters, which are neighboring nodes in the graph, and sometimes unites them in a way which affects only the neighbors of the united clusters. So why not divide and conquer?
What we ended up doing is this:
  • Break down the model very crudely into K parts, by doing something that is not far from just putting it on a grid and taking cubes. Choose K such that each part will have not so many facets in it, but not too little.
  • Run HC on each part separately until the number of clusters in the part becomes small.
  • Now unite adjacent parts until the number of clusters in each part is again not too big but not too small.
Note that this is in effect doing a Hierarchical Clustering of the parts, hence the winning name.
Also note that it effectively means you work on a number of very small heaps all the time and never on a very large one. This means heap operations are now considerably cheaper, and indeed the memory footprint went down by a factor of 5 or so, and the running time was improved dramatically, on machines with slower hard drives (since less swapping was involved).

The Price: Accuracy

As it often happens, the price of heuristics is that they mess up the algorithm's accuracy. In our case - it means that if the minimal edge happens to connect atoms that lie in different parts - you will get to it much later than you otherwise would have, but the reduction in running time and resource consumption made it worth our while.

Thursday, December 7, 2017

Fun at Work: What to Do When a Colleague Tells a Silly Joke

So, your colleague told a bad joke, and the only proper response is a rim shot sound. However, you have no drum kit within reach, so what you really need is a keyboard shortcut, shall we say Ctrl+Alt+Shift+R to play the following rim shot sound effect.





So, here's how you do it in Ubuntu:


  1. Download the above effect. Suppose you saved it to ~/rimshot.mp3
  2. Install some shell tool that plays audio files. I use play from the sox package.
    sudo apt-get install sox
    sudo apt-get install sox libsox-fmt-all
  3. Go to System Settings -> Keyboard -> Shortcuts -> Custom Shortcuts
  4. Click on the '+' sign to add a new one:
  5. Pick a name and have it run the command 'play' on your file:
  6. Click on the word "Disabled at the end of the line of your newly created shortcut, press the key combination you like (e.g. Ctrl+Alt+Shift+R), and your done.
  7. Wait for a colleague to tell a silly joke.
  8. Hit Ctrl+Alt+Shift+R.
  9. Go out on a hight note.





Thursday, November 23, 2017

How to Utilize Computing Power That is Now "Wasted" in Blockchain

As of the time I'm writing this, the Bitcoin network consumes more electricity than 159 of the countries in the world, as pointed out here.
To be clear - all this electricity is wasted by computers trying to solve (via brute-force) some computational problem. The first computer to solve it - is rewarded by the network, transactions are validated, another block is created, and everyone move on to the next problem. Another day on the blockchain.

And here's the somewhat weird part - nobody cares about the problems or the solutions. The only reason they are of interest, is that we can estimate the difficulty of solving them. So a huge amount of energy and resources is wasted on solving problems no one cares about.


Why Not Solve Important Problems Instead ?

I was thinking about this a few weeks ago when I thought - what if, instead of all this waste, those computers tried to solve hard problems that people do care about?
The famous class of NP-complete problems has a few problems in it that are real-world problems (other complexity classes make sense too, of course), for example - the traveling salesman problem (aka TSP). What if people who had TSP instances, because they are trying to navigate somewhere, or choose a route for some robot running erands in a large warehouse, could submit them to the blockchain and have them solved for them, using all this computing power?
There is an abundance of hard real-world computational problems, many of which arise in Chemistry and Biology. I understand very little about it, but I know researchers who actually send data to be processed for several days on a super-computer, just to get results for their research.
What if an entire blockchain network was, in effect, a crowd-sourcing project to solve hard computational problems? Wouldn't it be nice to put all this computing power to good use? Changing the existing Bitcoin protocol is probably out of the question at this point, but maybe a new crypto-coin, dedicated for that purpose? Maybe a different crypto-coin for different types of computational problems, the options are many, and seem interesting, at least in theory.
תוצאת תמונה עבור ‪computation stock photo‬‏
An illustration Google image search came up with for "hard computation"

The Nice Thing About This Idea

There are a few appealing notions here:

  1. Hard problems are already sorted into complexity classes. This means that hard problems of the same class are convertible to one another. We can decide that the network actually solves 3SAT instances, and reduce any TSP instance to a 3SAT, and then submit it to the network.
  2. It means hard problems actually have economic value for their hardness, where in the past it was only a cost associated with solving it. Now - solving the problem will oddly be its own reward... which is kind of neat.
  3. Maybe when submitting a problem, one can offer a reward for solving it.

Some of The Problems

While being a nice idea, it has tons and tons of problems. Here are just a few
  1. One can game the system - submit a problem, already knowing its solution, and collect the reward by solving it. This one is probably not hard to mitigate by scrambling the problem while converting it to the format readable by the network.
  2. A harder issue is how do we know an instance is hard at all to begin with? There are SAT solvers in the world exactly because there are many heuristics that cover some types of instances of the problem.
  3. Do people really need exact solvers? At least for the case of NP problems - for many of them an approximated solution is good enough, so is this really filling a need that people actually have? I don't know.

There are probably many other issues with this idea, but I thought it would be nice to put it out there.



P.S.

The world is a big place, I will be very surprised if I'm the first to come up with this...

Wednesday, April 26, 2017

Solving the Wrong Problem: My Take on 3DOR 2017

This week I attended the 10th 3D Object Recognition workshop, which was held as preliminary part of EuroGraphics. To sum up my impression from the vast majority of the works, I felt that the academia is working on the wrong problems.

Specifically - much of the research presented (though not all) dealt with CAD models, indexing them, analyzing them, finding similarities, descriptors, and transformations for them. This is, in my opinion, an approach that has lost touch with the actual state of data in the real world, and is driven mostly by cultural reasons such as the existence of previous work to rely on and CAD-based benchmarks to run. 

Scanned and CAD chair (not the same chair, of course)

In the past two years, we have seen an increasing number of 3D scanners and depth cameras, ranging from low quality devices - costing just a few hundreds of dollars - to expensive, high resolution, industrial devices, that produce impressive results. The tables have turned, and scanned data, which was a small minority, can now be easily generated in large quantities. This data ranges from lab-scanned toys, to huge urban scenes scanned by drones, nevertheless, the majority of the algorithmic works presented in 3DOR dealt with CAD models. I believe that there are two main reasons for that:

  1. There are almost no tagged scanned data sets, and none of substantial size (SceneNN, perhaps the best work so far, has only 100 scenes).
  2.  Most existing methods are tailored to CAD models, and are thus sensitive to one or all of the following traits, common in scanned data:
    1. High-frequency noise due to scanner accuracy issues.
    2. Incomplete models due to occlusion or simply unscanned sides of the object.
    3. Holes.
    4. Open boundaries.
Reason (1) becomes even more of an issue to anyone wishing to follow the deep learning trend, as it - as most ML approaches - requires tagged data sets of considerable size. 

A scanned scene from SceneNN data set

I believe that there are two types of data the academia should address seriously in the upcoming year or two - scanned small scale data, such as singular models and indoor scenes, and scanned large scale data, such as entire buildings, and streets. With the upcoming depth camera in iPhone8, we can expect a proliferation of the former, while the latter will be the result of increasing use of industrial scanners in security, construction, and drones.

In my experience, algorithms that worked well with CAD models, more often than not prove to be useless for scanned data. This is bad news for reusing much of previous work as-is, but is also good news, as it brings hope that we'll see exciting new approaches to 3D retrieval and recognition in upcoming years, that will truly have an effect on technology being developed, and thus on people's lives.


Tuesday, March 28, 2017

The UX of Work Emails

The polite way to get a co-worker's attention to discuss an issue without bothering someone, is to send an email or some other form of asynchronous communication.


In recent years I have found that it is very useful to think, when writing an email, more like a UX designer than a writer. This is because an email is meant to induce behavior, specifically, in addition to getting my message across, I want a quick response.

So when composing an email I...
  1. Use bullet points: they make replying to different points in the email much easier, and requires less explaining when doing so.
  2. Highlight important stuff, and in general, make scheming through it easy and informative. 
  3. Use images when I can, because people love images and respond to them more than to text (as any Facebook page manager will tell you). This is not to say I start digging through image banks, rather - I use plenty of screenshots all the time (I use Shutter on Ubuntu and the Windows Snipping Tool on Win7). And yes, the occasional meme, if I have a big block of text with no relevant image to accompany it.
  4. For a long email - start it with a tl;dr version of it, summarizing the whole thing in a sentence or two. This works in a similar way to a news story's headline - it gives the appropriate framing and context.
  5. Use the subject line to help recipients sort through their emails. I believe that the subject line is usually ignored when actually reading an email. It is read when people sort through a long list of emails, or get a notification. It therefore should be short, and useful in reminding the reader how this email differs from others on the list.
  6. Use code that looks like code. I use this website to format the code parts of an email and make them look like code. It helps readability, and looks cool.
    int RandomInt() {
      return 4;
    }
  7. Make it cross-platform. People read their emails on various devices, so I try to be very conservative with attachments, and limit them to PDFs and other easily-read formats, when possible. When I have some 3D model in a proprietary format that illustrates my point - I may send it, but I send an image of it as well, so that colleagues can make sense of it even if they are on the train, using their phone to read emails.
An image. Yeah, I know, amazing.


Tuesday, January 31, 2017

Generate a Puzzle for Your Math Class in 5 Minutes


There's a famous puzzle, often referred to as the Zebra Puzzle. It goes something like this:


  1. There are five houses.
  2. The Englishman lives in the red house.
  3. The Spaniard owns the dog.
  4. ...
  5. ...
    ...
Here follow many more facts detailing who lives next to whom, and which pets they own and so on, until the final question is given: Who owns the zebra?
There is no quick and clean solution for this puzzle that involves a single big "Aha!" moment, but rather a series of steps, making ad-hoc assumptions, and trying to see if they sit well with each other and with the facts given, until a solution is found.


Constraint Satisfaction Problems

This puzzle belongs to a class of problems called Constraint Satisfaction Problems, or CSP. 
While CSPs in general are hard to solve, even for computers, small instances of them are both easy to create, and make for popular puzzles. An example of such puzzle is Sudoku.

Floors Puzzles

Here's a really easy CSP I just came up with:
  1. There are 4 floors in the building.
  2. The rooster does not live on the top floor.
  3. The cat lives two floors under the lion.
  4. the dog lives in a lower floor than the rooster.

Which animal lives on which floor?
Spoiler alert: if you like to solve this puzzle, do so now, since the next section will reveal the solution. The puzzle is, by the way, so easy that I suggest you solve it yourself anyway, just to get an understanding of what it feels like.
An animal that does not live on the top floor

This is a nice puzzle for 10 year old students. A similar puzzle with 3 floors will be suitable for 7 year olds, and naturally, more floors can make it increasingly harder to solve.


Create Your Own Floors Puzzle in 5 Minutes

  • Choose animals and number of floors. 
  • Draw a diagram of which animal goes where, e.g.:
  • Write down rules that describe your building - e.g. "the dog lives 3 floors below the lion".




Over a certain number - some of the rules will be redundant, that is - the puzzle may be solved even if those rules are not given, but don't worry about that. As a rule of thumb - the number of rules should be slightly smaller than the number of floors.

Checking Solutions

Note that it may be the case that there will be correct solutions that are not identical to the one you used to write the puzzle, but that's okay, because the puzzle is about finding a solution, not the solution. While it is possible, in principle, to check whether the rules you chose have more than one solution, it will probably take more than the 5 minutes promised in the title of this post... 

So there you have it - generate a puzzle for your math class in five minutes or less. Enjoy.



P.S.

On a less-practical note, about a year ago I wrote a post tying "good" puzzles to NP-complete problems, and this current post is really just another example of that.

Wednesday, October 5, 2016

Notes from my "Intro to Information Theory" time


For two years, starting October 2010, I was a teaching assistant in the Introduction to Information Theory course at the Hebrew University, which was taught by Prof. Michael Werman, and the following year by Prof. Michael Ben-Or.
During that time, I a weekly two-hour tutorial, which was sometimes an extension of what was discussed in the lecture, and sometimes (as in the case of error correction codes) simply dealt with other topics.

After writing the previous post on check digits, I dug up the old notes, and found them to be quite helpful in refreshing my memory on the matter.

So, hoping it will prove helpful to others as well, and under an assumption that I'm not violating anyone's copyrights, here they are: