In “The Oregon Experiment”, the authors consider the following budgeting scenario.
Suppose you’re in the following situation: You have ~$2,500,000 that you’d like to allocate to construction projects in your community.
There are many ways you can allocate this money to projects of varying sizes. Consider the following options:

| category | number of projects | rough total cost based on averages |
|---|---|---|
| A < $1000 | 1 | $500 |
| B $1000–$10,000 | 1 | $5,000 |
| C $10,000–$100,000 | 1 | $50,000 |
| D $100,000–$1,000,000 | 1 | $500,000 |
| E > $1,000,000 | 1 | $2,000,000 |
| totals | 5 | ~$2,600,000 |

| category | number of projects | rough total cost based on averages |
|---|---|---|
| A < $1000 | 1000 | $500,000 |
| B $1000–$10,000 | 100 | $500,000 |
| C $10,000–$100,000 | 10 | $500,000 |
| D $100,000–$1,000,000 | 1 | $500,000 |
| E > $1,000,000 | ⅒th of a project | $500,000 |
| totals | ~1100 | $2,500,000 |

| category | number of projects | rough total cost based on averages |
|---|---|---|
| A < $1000 | 500 | $250,000 |
| B $1000–$10,000 | 50 | $250,000 |
| C $10,000–$100,000 | 10 | $500,000 |
| D $100,000–$1,000,000 | 1 | $500,000 |
| E > $1,000,000 | 1 | $1,000,000 |
| totals | ~550 | $2,500,000 |
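
To check the arithmetic in the three tables, here is a small Python sketch that simply sums the rows as given. The option labels and the data layout are my own, for illustration only; they don't come from the book.

```python
# Rows are (category, number of projects, rough total cost in dollars),
# copied from the three tables above. The dict keys are illustrative labels.
OPTIONS = {
    "option 1": [("A", 1, 500), ("B", 1, 5_000), ("C", 1, 50_000),
                 ("D", 1, 500_000), ("E", 1, 2_000_000)],
    "option 2": [("A", 1000, 500_000), ("B", 100, 500_000), ("C", 10, 500_000),
                 ("D", 1, 500_000), ("E", 0.1, 500_000)],
    "option 3": [("A", 500, 250_000), ("B", 50, 250_000), ("C", 10, 500_000),
                 ("D", 1, 500_000), ("E", 1, 1_000_000)],
}


def summarise(rows):
    """Return (total number of projects, total cost) for one option."""
    projects = sum(count for _, count, _ in rows)
    cost = sum(total for _, _, total in rows)
    return projects, cost


for name, rows in OPTIONS.items():
    projects, cost = summarise(rows)
    print(f"{name}: {projects:g} projects, ${cost:,}")
```

The exact sums differ a little from the rounded totals in the tables (for example $2,555,500 against ~$2,600,000 for the first option, and 562 projects against ~550 for the third), which is consistent with the totals being rough.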

The main conclusion is that, for the same amount of money, we can choose to either support lots of small projects, or few total projects.
One of the main premises of the book is that good change is made locally, by locals. In this way, Options 2 and 3 are a significant improvement over Option 1.
One of the best initiatives that Victoria is doing along these lines is the “Pick My Project” program.
However, I think it’s also interesting to think about this in relation to other areas:
Over on the Silverpond GitHub account we recently open-sourced the little dancebooth project that we’ve been playing with for a while now.
It’s a very simple little wrapper around their PoseNet demo.
It’s a neat little repo, that runs entirely offline, and, from a webcam or anything that can be accessed via the browser, can be used to capture dances in three forms:
You’ll find all of these saved under the date and time in the ./savedvideos folder. The JSON can be used to recreate the entire dance in any form you wish; it gives the coordinates of the various joints.
At present, the dance capture only starts when the people in frame all raise their hands above their heads.
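
The "hands above heads" trigger can be sketched roughly as follows. This is only an illustration in Python, not the project's actual code: it assumes each pose has the shape PoseNet returns (a `keypoints` list whose entries carry a `part` name and a `position`), uses the nose as a stand-in for the head, and the function names are made up for this example.

```python
# Sketch of the "everyone has their hands up" trigger.
# ASSUMPTION: each pose looks like PoseNet's output, i.e.
# {"keypoints": [{"part": "nose", "position": {"x": ..., "y": ...}}, ...]}.
# Image coordinates grow downwards, so "above" means a smaller y value.


def keypoint_y(pose, part):
    """Return the y coordinate of the named keypoint, or None if it is absent."""
    for keypoint in pose["keypoints"]:
        if keypoint["part"] == part:
            return keypoint["position"]["y"]
    return None


def hands_raised(pose):
    """True if both wrists are higher in the image than the nose."""
    nose = keypoint_y(pose, "nose")
    left = keypoint_y(pose, "leftWrist")
    right = keypoint_y(pose, "rightWrist")
    if nose is None or left is None or right is None:
        return False
    return left < nose and right < nose


def should_start_capture(poses):
    """True only when at least one person is in frame and all have hands up."""
    return len(poses) > 0 and all(hands_raised(pose) for pose in poses)
```

The same keypoint lookup would be the starting point for replaying a saved dance from the JSON, provided the saved file follows the assumed shape.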
Feel free to change and play with it as you wish!
(This series of posts originally appeared on the Silverpond blog)
Well, we made it to Innsbruck, Austria!
It was a huge journey to get here, and I have to tell you, Austria in general, and Innsbruck in particular, is absolutely beautiful.
On the train ride from Vienna to Salzburg, we spent most of the time looking out of the window taking photos. The weather is amazing; it’s perfectly warm and sunny, and it’s amazing to walk around the town and just see mountains everywhere.
But, I’m here, of course, on business, and in particular, primarily to attend the QML+ – Quantum Machine Learning … Plus – conference.
It’s the night of Day 1, so here’s a review of what happened:
by Hans Briegel
Hans Briegel, famous partly for his involvement in measurement-based quantum computation (and of particular relevance to me, because this was part of what my Masters work was about), gave an overview talk about why we might care to think about how quantum computing could play a role in machine learning.
I quite enjoyed one of his ideas, which is thinking about how “Embodied AI” relates to the idea that “information is physical”, insofar as both imply that in order to think about the primary subject, we need to involve physics. This has been particularly fruitful in physics and information theory, and relates to some very far-reaching ideas, such as black holes.
He used this as motivation to study an “Artificial Agent” (or in standard lingo, reinforcement learning) in the quantum setting:
His observation is: What is quantum? There are four options:
He noted that in the Quantum-Quantum setting, there are some foundational open problems:
I don’t think even these questions are quite clear to me, let alone the answers; but still interesting to think about.
One idea/question I had is: What is the simplest truly quantum reinforcement learning problem?
by Patrick Rebentrost
Next up was Patrick, who gave a far-reaching talk on a variety of topics that he and his colleagues have been researching over the last few years.
He started off by reminding us of a bunch of challenges that he posed in a prior paper:
Next, he talked about a quantum algorithm for training a so-called “Hopfield” neural network via the Hebbian learning procedure.
The Hopfield network is simply one in which every node is connected to every other node; there is only one layer, and every node is both input and output (there are no layers, essentially). This may seem odd, and you should rightly wonder how you could train such a thing. One way to train it turns out to be so-called “Hebbian” learning, which is inspired by the human brain. The idea is captured by the phrase: “Neurons that fire together wire together”. With this idea in hand, it’s possible to develop a scheme to encode all this into a quantum computer, and perform all the updates and training. You can find more here.
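
For a feel of what is being trained, here is a minimal classical Hopfield network with the Hebbian rule, in plain Python. To be clear, this is the ordinary classical version, not the quantum algorithm from the talk; the function names and the toy pattern are mine.

```python
# Classical Hopfield network trained with the Hebbian rule (pure Python).
# States are lists of +1/-1 values; every node is both input and output.


def train_hebbian(patterns):
    """Build W[i][j] = (1/P) * sum over stored patterns of p[i] * p[j]."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j] / len(patterns)
    return weights


def recall(weights, state, max_sweeps=10):
    """Update nodes one at a time until the state stops changing."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            field = sum(w * s for w, s in zip(weights[i], state))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


stored = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hebbian([stored])
noisy = list(stored)
noisy[0] = -noisy[0]  # flip one bit
print(recall(weights, noisy) == stored)
```

"Fire together, wire together" shows up in `train_hebbian`: two nodes that agree in a stored pattern get a positive weight between them, and nodes that disagree get a negative one. With the single stored pattern above, the flipped bit is pulled back and the script prints `True`.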
For the everyday deep learning person, these ideas may sound a bit odd. Rightly so, because it’s not standard practice. Essentially the only reason to focus on these in the quantum machine learning setting is that this is a network for which we can come up with a scheme to implement it on a quantum computer. A natural question is: Can we adapt the Hopfield-network techniques to work with multiple layers? I tentatively feel like the answer could be “yes”, but I haven’t thought a lot about it.
The next paper he talked about is this one: Quantum gradient descent and Newton’s method for constrained polynomial optimization.
I happened to read this one when it came out, because it was quite a big step. Previously, we had no idea how to even compute a quantum gradient, so this contribution was huge.
Unfortunately, the main problem of this paper is that the algorithm gets exponentially slower as the number of training steps increases. This is, at least naively, incredibly problematic for typical machine learning, where the number of steps is in the hundreds of thousands. In the paper they make the argument that oftentimes good results can be achieved after a very small number of steps, but it’s not clear to me how practical this is.
His final topic was quantum computational finance; he was basically out of time, so didn’t go into much detail, but the main idea is that, again using a standard technique in quantum computing called “amplitude amplification”, one can achieve a quadratic speedup in a certain kind of derivative pricing. It turns out that banks are genuinely interested in these techniques, because being able to price something in, say, ~2 days instead of 7 days is a significant advantage market-wise.
Patrick ended with a funny remark along these lines, which is that the beauty of working in the finance world is that you don’t need to prove anything: you simply build it, and let it go around making trades in the market; if it doesn’t work, you simply lose money!
Over lunch, I had a really nice chat with Pooya Ronagh from 1Qbit and Ronald de Wolf. We chatted largely about how the practical everyday machine learning could be aided by quantum techniques. Ronald pushed hard to understand what areas quantum researchers should focus on, and Pooya and I were trying to come up with ideas. Pooya had an interesting comment that, in many ways, faster machine learning isn’t super useful, because for the physical cost of a quantum computer, you can already buy significant hardware and get great results. So bad results faster doesn’t really help, in a foundational way.
One thought we had is that maybe just flat-out alternatives to gradient descent would be interesting; i.e. we know there are areas where gradient-descent-style optimisation is not great: translation, program synthesis, neural architecture search, etc.
In any case, it was a very inspiring chat, and I was really glad to have met them!
by Wolfgang Lechner
This, I must say, was quite technical, and I didn’t quite follow most of it. But I did get the general idea.
The main tool of quantum machine learning is the so-called HHL algorithm (see also: Quantum linear systems: a primer). One thing it requires is efficient loading of the training data. It turns out that typically, if you want to load the training data into a quantum algorithm, in general you’ll need to do an exponential amount of work in the number of training samples, which is hugely problematic. I think I need to understand this a bit more, but at least the basic idea was clear: the data loading needs to be sped up.
The main contribution of this work is that, through a rather elaborate procedure, partially described here: Programmable superpositions of Ising configurations (but more in upcoming publications), it’s possible to prepare the required state by encoding it into a Hamiltonian, and then letting the Hamiltonian evolve via adiabatic evolution. How? Hebbian learning, evidently! I admit that I didn’t follow most of this talk, but I do think this kind of thing is quite interesting, and there’s definitely a need to solve this general, and reasonably embarrassing, problem.
We’re back. The first two talks were quite great, and there was another that was interesting and is worth a mention.
by Aske Plaat
Even though it had a reasonably uninspiring title, this talk was actually excellent, and should’ve, in fact, been the opening talk of the conference.
Aske introduced some motivation, and introduced some simplifying assumptions about AI to try and cut off typical arguments about what it means to be “intelligent”. He defined his working notion, which is “to be intelligent is to act intelligently”, which later was quite controversial to a number of people.
He had a unifying way of explaining why we’ve seen such a boom of AI recently, which is:
I think this is a nice way of phrasing it. He then dove into the various parts in more detail, starting with algorithms.
He introduced what he sees as two main camps of machine learning:
Connectionist AI, as he sees it, is the kind that we all know and love: deep learning, neural networks, “bottom-up” reasoning, function approximation, etc.
Symbolic AI, as he sees it, is more related to philosophy, logics, ontologies, expert systems, planning, Q-learning, and other kinds of predefined “slow-thinking/high-level reasoning” ideas.
The main point he makes with the distinction is that maybe more merging between the two schools of thought needs to take place. He gives the example of AlphaGo as being a case where the two ideas merged. Another one I thought of is the idea of Algebraic Machine Learning, which certainly has some grand claims, but is at least mildly interesting for its ideas.
He then made some comments about how speed is also relevant, and without it we wouldn’t see such a boom. Again this is of interest to quantum-computing types, because being faster than classical computers is fundamentally what the field is all about, and that’s where there’s been a lot of focus recently (i.e. quantum speedups over classical algorithms).
Aske also noted the abundance of benchmarks for classical machine learning, which became a theme for a few of the questions during question time. In particular we discussed who, if anyone, should come up with good benchmark datasets and problems for quantum machine learning, and how. Presently, no one has anything good along those lines.
He then noted some challenges in classical ML, and made the observation that simply achieving a speedup won’t solve these problems (for example, adversarial attacks, or the delayed credit assignment problem). The claim is that we need to put some effort into what truly quantum algorithms might look like.
The main thing I got out of the talk was the idea that we should be thinking about making new benchmarks for quantum machine learning.
by Alejandro Perdomo-Ortiz
Alejandro is very experienced in this field, it turns out. He’s been leading a team at NASA working on QML for the last 5 years, and now has moved to Rigetti, where he’s conducting research on the frontiers of quantum machine learning (also, Rigetti has a quantum cloud service coming …)
At NASA his aim was to drive interest in the practical usage of the quantum devices that NASA had purchased (in particular the D-Wave).
He noted that quantum chemistry, and the simulation of quantum systems, was the most natural idea, and everyone should be looking at it. But furthermore, he was tasked with thinking of other problems that could be mapped to these particular optimisation devices. Naturally, one idea is just straightforward discrete optimisation: finding some satisfying assignment of variables for the minimisation of some particular cost function. And he conducted some early work here mapping protein folding to a certain kind of optimisation problem.
He echoed Aske’s thoughts and said that we should be focusing on designing new algorithms, over just simply speed.
One of the most memorable quotes from his talk was “Look for the intractable, the more intractable the better, for me”.
One thing he spent a bit of time on was using the D-Wave to again implement one of these Hopfield networks (he called it here a “fully-visible model”), on a simplified digit dataset. Turned out it worked! Which essentially demonstrated that it was indeed possible to map an ML problem onto the device, and then have the device learn its own weights (couplings, here) which would allow it to do well at generating new digits!
Following this work, they then observed that in fact they could train an autoencoder entirely classically, and then use the embedding vectors for all the training data, as in the setup above, to train a kind of hybrid generative system:
I must say that I found this both interesting and confusing. It’s interesting because it’s a great way to use the complicated device to do “real” work, even when on its own it has a very small number of input nodes (it was something like 46, here, for this device). But it’s also confusing because most of the “juice” in the network is in the classical weights, not in the embedding vector itself. When I asked Alejandro about this, he said that it was mainly a way to demonstrate the hybrid set up, and that over time the idea is to make more regions quantum, and see how that changes things. I find it very interesting to think about how one would even go about jointly training a hybrid quantum-classical system.
The next idea he covered was the learning of quantum circuits (see also Differentiable Learning of Quantum Circuit Born Machine).
This I think is a particularly great idea, and his approach was to focus on generating certain kinds of entangled states, with great results. They managed to find a state that has more entanglement, compared to the standard one, with this scheme. They also made some interesting observations about the expressive power of the depth of the circuits and what kind of states they can possibly prepare.
His final insights were:
For lunch, Gala and I decided to enjoy the beautiful park right next to the venue! By chance, there was a beer garden inside!
by Alexey Melnikov
This talk was essentially another kind of program-synthesis problem, but this time in the language of optical elements. The idea is that, given some set of optical elements, and some number of qubits, how can we find all the possible sets of operations that make entangled states?
Their new idea is to use a reinforcement-learning-inspired framework called “Projective Simulation”. I must say that I found the framework a little odd, but they did get good results, and it’s available as a Python library for you to experiment with!
by Rainer Blatt
This talk was a bit oddly placed. It was an overview of how quantum computing works, and an introduction to the trapped-ion style of quantum computing.
The last event of the day was a very large panel of most of the speakers (~10 people) with a bunch of questions prepared by the organisers that were aimed to be thought-provoking. The best comment that came out of the entire discussion was from Matthias Troyer:
“That’s how you get a quantum advantage with zero qubits”
He was describing the recent work by Ewin Tang: A quantum-inspired classical algorithm for recommendation systems (which we actually already covered here).
Here’s the list of open/interesting topics from today:
What does truly quantum ML look like? Let’s stop trying to map classical algorithms to quantum ones, and just make up new ones
Today, there were only 3 talks; and we had a free afternoon! So we climbed* the big mountain!
*: Okay okay, by “climbed” I mean “took the lifts”. But there were 3 lifts!
by Ronald de Wolf
This was a great and classic talk in the same vein as many talks in the theory of quantum computation.
Ronald addressed the natural problem of what you could do if your data was given to you as a quantum state (thereby ignoring all the problems that have been brought up with QRAM in the past few days; let’s just suppose we have the state!).
The proposal is to consider supervised learning in a very formal sense; imagine we have a function: f : {0, 1}^{n} → {0, 1}.
Then, we can think of it as a supervised learning problem where we have n binary features, producing a single binary output, and we have our examples in the form: (x, f(x)).
Ronald wanted us to consider the so-called sample complexity, i.e. how many times we have to evaluate f, instead of the more standard time complexity. Here, the two ideas are at least related, because fewer samples will take less time.
In general we’d need 2^{n} examples to learn f fully, but we’ll want to do efficient learning, so we’d like to learn from far fewer queries than that.
Ronald preferred to work in the framework of PAC-learnability, which was introduced by Leslie Valiant in 1984. Leslie has a nice readable book on the topic (I unfortunately lost my copy a while ago) which is well worth a read.
It turns out that in 1995, Bshouty and Jackson introduced a quantum version of this notion (see also Quantum DNF Learnability Revisited).
Their idea is to consider the state of training data:
$$ \sum_{x \in \{0,1\}^n } \sqrt{D(x)} \, |x, f(x)\rangle $$
To demonstrate a speedup, suppose that we have the uniform distribution over the samples, D(x) = 1/2^{n}; then our state becomes
$$ \frac{1}{\sqrt{2^n}} \sum_{x \in \{0,1\}^n } |x, f(x)\rangle $$
The main idea is to hit this with the Fourier sampling tool, which is a classic trick in quantum algorithms. The essential idea is: first suppose that f(x) = ±1; then we can do another standard trick, the Hadamard transformation, and obtain the state
$$ \sum_{s \in \{0,1\}^n } \hat{f}(s) \, |s\rangle $$
where
$$\hat{f}(s) = \frac{1}{2^n} \sum_{x} f(x) (-1)^{s \cdot x}.$$
The point here is that when you measure this final state, you see s with probability $\hat{f}(s)^2$. For a certain choice of f, namely when f is linear mod 2, you can learn the function perfectly in exactly one query. Classically you would require at least n queries. Great reduction!
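
The one-query claim for linear functions can be checked by brute force for a small n. The sketch below computes the measurement probabilities classically (it is not a quantum program); the hidden string `secret` and the function names are made up for illustration.

```python
from itertools import product

# Brute-force check of the Fourier-sampling claim for a small n.
# This simulates the measurement statistics classically; it is not a quantum program.


def dot_mod2(a, x):
    """Inner product of two bit tuples, modulo 2."""
    return sum(ai * xi for ai, xi in zip(a, x)) % 2


def fourier_coefficient(f, s, n):
    """hat f(s) = 2^-n * sum_x f(x) * (-1)^(s.x), with f taking values +1/-1."""
    total = sum(f(x) * (-1) ** dot_mod2(s, x) for x in product((0, 1), repeat=n))
    return total / 2 ** n


def measurement_distribution(f, n):
    """Probability hat f(s)^2 of seeing each outcome s."""
    return {s: fourier_coefficient(f, s, n) ** 2 for s in product((0, 1), repeat=n)}


n = 4
secret = (1, 0, 1, 1)  # hypothetical hidden string defining f(x) = (-1)^(secret.x)


def linear_f(x):
    return (-1) ** dot_mod2(secret, x)


probabilities = measurement_distribution(linear_f, n)
print(max(probabilities, key=probabilities.get))  # the outcome seen with probability 1
```

For a linear f all of the probability sits on the single outcome s equal to the hidden string, so one measurement reveals it; for other functions the probabilities spread over several outcomes, though they still sum to one.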
He then went into a few more examples, getting a little bit more technical each time; one was for the so-called “Coupon collector” problem, which is simply: imagine you get a random baseball card from a store, how many times do you need to visit the store before you have the entire set of cards?
Again, because this is a uniform problem, we can use similar techniques to improve on it. Classically, one can find that the expected number of store visits (samples) is approximately N log N (where N is the number of cards in total), but quantumly it can be done with O(N) samples.
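
The classical figure is easy to check numerically: for uniformly random cards the exact expectation is N times the N-th harmonic number, which grows like N log N (natural log). A small sketch, with a function name of my own choosing, covering only the classical side:

```python
import math


def expected_visits(n_cards):
    """Exact expected number of uniform random draws to see all n_cards: N * H_N."""
    return n_cards * sum(1 / k for k in range(1, n_cards + 1))


for n_cards in (10, 100, 1000):
    exact = expected_visits(n_cards)
    approx = n_cards * math.log(n_cards)
    print(n_cards, round(exact, 1), round(approx, 1))
```

The exact value sits a little above N log N because the harmonic number exceeds log N by roughly a constant, which adds a term linear in N.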
Perhaps of interest were some results showing that no quantum speedup can be obtained for all distributions; i.e. there are some bad distributions that we will just struggle with. I don’t think anyone finds this particularly surprising, and it shouldn’t have a big impact on real-world problems, because typically, most data isn’t, shall we say, adversarially prepared; the distributions tend to be wildly different (a natural example being the set of all images of mountains as a subset of all possible images).
The main idea here is that by utilising standard tricks from quantum algorithms we can get significant speedups when we know something about the distribution that we are learning from.
by Mario Ziman
This one was a little bit technical, but it had some interesting ideas.
Mario phrased the problem of quantum machine learning as follows:
The point being that in typical ML we want to modify the function f, and in quantum ML we wish to modify the unitary U_{f}. This turns out to be quite problematic, because this function is truly quantum, and we know that quantum states suffer from the no-cloning principle: you cannot copy an unknown quantum state; so you can only use it once. Mario mentioned that this results in a no-programming theorem, which means that we cannot perfectly store an unknown quantum transformation (i.e., even if we find a U_{f} that works well, we can’t save it! We have to know how many times we want to use it in advance!).
However, it turns out we can still make some progress if, instead of requiring some kind of exact copy, we aim for probabilistic performance. The only remaining thing I got from this talk was that probabilistic learning, in their sense involving some kind of “probabilistic Q learning”, is related to quantum teleportation. You can read about this work more here.
by Renato Renner
This was a talk I was quite excited about. The paper for it is here. They’ve also made the code available, in TensorFlow.
The fundamental idea is really nice: Maybe we can build a neural network to act exactly as a standard human physicist would: they shall observe data, try and form theories, ask questions of their theories, and then check the answers.
This is very nicely expressed by the picture from the paper:
The point is that they train an autoencoder on the data, but they put constraints on the latent vector by adding the mutual information between its entries to the loss, so as to require independence; i.e. the parameters that the network learns should be independent.
They then introduce this idea of questions and answers. Unfortunately, I think this idea doesn’t go far enough: it seems to me that the answers are treated as the input data itself, so that truly this network only functions as a special kind of autoencoder (at least, this is how it is in the code, and as far as I can see in the paper).
In any case, they’re able to show several problems where it is able to look at experimental data and where the latent variables are correlated to the terms that they expected to see in the equations that model these phenomena. I think that’s pretty cool.
One thing Renato noted quite strongly was that perhaps this isn’t super surprising, and maybe the network received a lot of help by the way the data was sent to it. I think that this could be mitigated, maybe, by thinking a bit more about the question/answer set up, and more generally thinking about how we can allow the network to reject data samples.
Some ideas I had during this talk related to the kind of “falsifiability” ideas; namely that if this system is forced to come up with theories for data, then how can we also ensure that the theories it has can be proven wrong?
My other idea was: what if, instead of just having a latent variable, i.e. an independent variable, we had an entire neural network in there, that could perhaps serve as an “independent explanation”?
Overall, I quite enjoyed this talk and idea.
The interesting things that came to me regarding today were:
Well, it’s Thursday night, and I’ve just finished up my last day at the conference. Tomorrow, we’ll be heading on our (short) holiday! While QML+ formally has two more talks tomorrow, they are less relevant to me personally, plus we need to get a head start to make it to some waterfalls!
Here’s my summary of the talks I attended today!
by Matthias Troyer
With an amusingly titled talk, Matthias is the master of classical simulation algorithms for quantum processes. He spends most of his time working on the software side, trying to demonstrate practical quantum speedups for optimisation problems.
As with most of the other talks, he described several pieces of work. The first was a neural network that could be used to learn a quantum wave function, and then used to find phases and amplitudes of given states, and compute other properties.
Their setup was the (seemingly standard) Restricted Boltzmann Machine, where the input was whether or not there is a z-rotation on the given qubit, and the output being the inner product with some state |s⟩.
But, non-standardly, the weights of this network are actually found using what they refer to as “reinforcement learning”, but is actually something called “Stochastic Reconfiguration”. Once they find initial values for the weights, by looking at the Hamiltonian of the particular system, they then fine-tune them if they want to compute properties that depend on time. It’s a little bit involved, to say the least.
Anyway, having done this, they do achieve some nice results. They are able to use their neural network to compute various properties quite well.
Later, they applied an RBM again, but without the weird “Stochastic Reconfiguration”, and were able to get very good results in learning quantum states.
He then spent a bit of time covering his work on quantum annealing. In particular, in that work they observe that quantum annealers seem to be fated to always produce unfair samples of the potential states; i.e. not every state has equal probability to appear. Ingeniously, they came up with a classical simulation of quantum annealing that is actually faster and more accurate. Even more ingeniously, they show that in fact they can implement the classical simulation as a quantum process, and again get a speedup, for a total of a quartic (2 times quadratic) speedup!
All this resulted in the numerical comment that if there is any quantum annealing problem you’re running classically for more than 1 day, you’ll go faster if you use a quantum annealer.
One of the interesting conclusions for this part of the work was that, when you’re classically simulating something evolving adiabatically, sometimes it’s better, if you want tunneling, to evolve very fast (the adiabatic theorem would tell us we need to evolve slowly). He demonstrated this in an amusing way by saying that he could tunnel through the wall in the room if we would just close our eyes for 30 seconds, instead of 30 microseconds.
His other summaries were:
Unfortunately, a talk I was looking forward to, by Francesco Petruccione, probably related to this work, was cancelled, so Gala and I went for a long hike instead. We almost got lost, found beautiful forests, found a beautiful field that reminded me of Jurassic Park, lost faith in ever seeing the bottom of the mountain again and eventually made it back to the lifts alive.
This impromptu adventure meant we just got back in time for the last talk of the day.
by Giulio Chiribella
This talk was essentially based around this paper.
The main point is to think about a framework for causal hypotheses, and then see how classical and quantum approaches compare. The setup is like so:
We think of curly-C as some kind of unknown process (for example, node.js, ha ha ha), and then ask ourselves: What is the causal relationship between B and A? And between C and A?
The setting Giulio proposes is that we want to be able to determine exactly, from a given set of hypotheses, which one is correct. Here, imagine the following:
The question is: Who can do better, as a function of the number of trials, at determining which hypothesis is right? In order to be able to make progress, we allow ourselves interventions; i.e. we can feed data into A, and then use that to make subsequent queries to curly-C.
For reasons I don’t really understand, in the paper they claim that classically, if the dimension of all variables is finite and fixed to d, then, if B (or C) is dependent on A, that means that the function mapping A to B is invertible. With such a constraint, it’s easy to see that it’s possible to determine the difference between the two hypotheses. The value of interest to them is the “discrimination rate” as the number of experiments performed grows. They find that it is log d. Quantumly, they find that they are able to differentiate the two hypotheses with discrimination rate 2 log d. This, in the theory they’ve developed, is exponentially better than the classical case. Great!
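
One way to make sense of “exponentially better”, assuming (this is my reading, not something I can confirm from my notes) that the discrimination rate R is the exponent with which the error probability decays after N experiments, with logs taken base 2:

$$ p_{\mathrm{err}}(N) \approx 2^{-R N}, \qquad p_{\mathrm{err}}^{\mathrm{classical}}(N) \approx 2^{-N \log d} = d^{-N}, \qquad p_{\mathrm{err}}^{\mathrm{quantum}}(N) \approx 2^{-2 N \log d} = d^{-2N} $$

Under that assumption, doubling the rate squares the error probability, so after N experiments the quantum error is smaller than the classical one by a factor of roughly d^{-N}, which shrinks exponentially in N.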
I left this talk a little bit confused, but at least vaguely interested in the idea of quantum causal modelling.
The interesting things that came to me regarding today were:
Overall, I’m inspired by quantum machine learning. I feel like there’s heaps of cool things to do.
Unfortunately, I’m disappointed by some things about this conference. Having come from so far away, and wanting to maximise my time the best way, I found it frustrating that even the titles for most of the talks weren’t known in advance.
I found the conference events and overall feeling to be very non-inclusive. There was lots of mention of people working in ML/QC as “guys”; there were lots of in-crowds and, while there was lots of talk of wanting to mix with the “machine learning crowd”, people were somewhat skeptical of me not being associated with any university, and there were attempts to sort people into either “machine learning/classical” or “quantum”. Further, there was also no mention of a code of conduct.
Sarah Moran, of Girl Geek Academy, once gave a talk about “micro positive-actions” (or something, I can’t remember the name) but the ones that stuck out to me were:
These are great rules of thumb for any organisers to keep in mind. If you have more, please let me know!
Overall, it would be great to see these academic conferences put significant effort into making their conferences feel much more welcoming to all types of people.
The first thing to do is to stop whatever else you are doing.
Now stand or sit in a comfortable position.
However you wish.
Notice your breathing.
As you breathe in,
be aware
that you are
breathing in.
As you breathe out,
notice that you are
breathing out.
Notes on Programming
Many of us spend a lot of time programming. We program at our jobs, we program at cafes, we program at home. To program, in this blog, means to program in such a way that you enjoy programming, to program in a relaxed way, with your mind awake, calm, and clear. This is what we call programming, and it takes some training and practice.
There’s lots of talk about the Ethics of AI at the moment. As with any research, there’s too much for any one person to read. Here’s a bunch of papers that I’ve collected haphazardly in the early part of this year:
One thing I wanted to think about is, speaking as someone working in this field and interested in making changes in my day-to-day life, what kind of tools or ideas would be useful for me? What should I do?
Alongside this thought, another thought I had is that somehow the big lists of rules feel very impersonal and disconnected from my experiences. I also feel a little bit unsatisfied about opt-in rules. Here’s a few from around the place that I’ve seen:
I have a few problems with these rules:
The positive aspects of them are:
So, what should any given engineer working in this area do? One thought I’ve had recently is a simple one: Let’s just aim at building empathy for the people that will be affected by our software.
This is reasonably actionable, say, with local groups by organising meetings between technical people and the people that may be affected. I.e. in the medical-AI setting, let’s organise regular catchups between the engineers, the doctors, nursing staff, and hospital administration types, along with perhaps patient representatives.
In the setting of, say, law software, again we just set up regular events for the two groups to chat through issues, work together on small projects, and build a mutual understanding of difficulties.
I think this approach is a bit nicer than, say, creating a new set of rules that make sense for us locally, and then forcing people to follow them. One idea I like about the empathy-based/collaborative approach (or “human-centered design”; another term for this kind of thing), is that it allows people to adapt to local circumstances, which I think is really crucial in allowing any one person to feel like they have some control over the application of any rules they come up with, and thus getting them to actually take an interest in enforcing them in their organisation.
So, my new rule of thumb for this ethics-related AI stuff will be: Can I meet with some of the people that will be affected? What are their thoughts? What problems are they working through and what are they interested in?
As always, I’m interested in your thoughts on the matter!
I stumbled across this blog post of Corentin Dupont where he put together a library that allows you to modify your hakyll blog so that you can have inline diagrams! As anyone knows, this was amazingly exciting to me, because I love diagrams.
So I quickly tried to set it up; but, much to my sadness, it didn’t immediately work.
Luckily, however, I was able to make it work by hacking around in the two relevant repos:
The main result is a function, pandocCompilerDiagrams, that I included into my hakyll site file like so:
match "posts/*" $ do
    route $ setExtension "html"
    compile $
        (pandocCompilerDiagrams "images/diagrams" <> pandocMathCompiler)
        >>= loadAndApplyTemplate "templates/post.html" postCtx
        >>= saveSnapshot "content"
        >>= loadAndApplyTemplate "templates/default.html" postCtx
        >>= relativizeUrls
And so now, I can have inline diagrams! Check it out:
Imagine we had a circle:
example = circle 1
But now, what if the circle was repeated 5 times?
example = hcat (take 5 $ repeat (circle 1))
Cool!
To celebrate, let’s draw the Sierpinski triangle:
The basic building block:
sierp d = d === (d ||| d) # centerXY
example = sierp (triangle 1)
Let’s go!
sierp d = d === (d ||| d) # centerXY
example = foldl (\d _ -> sierp d) (triangle 1) [1..3]
Colours!
import Data.Colour.Palette.ColorSet
color n = rybColor (n*2)
sierp d n = d1 === (d2 ||| d2) # centerX
  where
    d1 = d # bg (color n)
    d2 = d # bg (color (n+1))
example = foldl step d0 [0..5]
  where
    d0 = triangle 1 # lw 0
    step d n = sierp d (n*2)
Happy days!
I’ve been reading “Surfaces and Essences” by Doug Hofstadter and Emmanuel Sander.
You can essentially judge this book by its cover: it argues that analogies are the key technique we use to think and understand. I quite love this book, in particular because it presents lots of interesting ideas for people interested in the topic, and for people interested in AI and machine learning.
The reason I enjoy this book so much is because I think it presents a very strong task for machine learning people to tackle; namely to build a system that is capable of this analogical reasoning. One thing that’s true of all modern machine learning systems is that their knowledge is very “narrow” and, almost all of the time, the bounds of it are determined entirely before training.
In any case, we’re focused right now on translation. Doug recently wrote about his thoughts here: The Shallowness of Google Translate.
I’m not going to go into a lot of detail here; I just want to track the progress of a specific phrase that Doug and Emmanuel have in the book. They started tracking it in 2004, and it being 2018 now, 14 years later, I wanted to see how things had progressed.
Here’s the complex paragraph (in French) and the task is to translate it into English:
Original
Parfois, le succès ne fut pas au rendez-vous. On a beau y penser très fort, le bon numéro ne sort pas forcément. Sagan prenait échecs d’auteur dramatique comme les revers casino, avec respect pour les caprices de la banque et du ciel. Il faut bien perdre un peu, pour mieux savourer la gagne du lendemain. Qui ne l’a pas vue « récupérer » en quelques quarts d’heure les pertes de toute une nuit ne peut comprendre comme c’est joyeux de narguer le sort.
Human translation by Doug and Emmanuel
Sometimes things just didn’t work out right; no matter how hard she wished for it, the dice simply wouldn’t come up her way. But Sagan always took her failures as a playwright much as she took her gambling losses, acknowledging the arbitrary whims of the house and of divine fate. After all, everyone has to lose now and then, so that the next day’s victory will taste all the sweeter. And if you never saw her win back a whole night’s losses, often in well under an hour, you just can’t have any idea of the glee she took in laughing in the face of destiny.
Then here are the translations over the years (the first two coming from the book itself):
Google Translate, 2004
Sometimes, success was not with go. One thinks of it in vain very extremely, the good number does not leave inevitable. Sagan took the failures of dramatic author like in the reverses of casino, with respect for the whims of the bank and the sky. It is necessary well to lose a little, for better enjoying gains it following day. Who did not see it “recovering” in a few fifteen minutes the losses of a whole night cannot include/understand as they is merry of narguer the fate.
Verdict: Terrible.
Google Translate, 2009
Sometimes, success was not there. It was nice to think very hard, the proper number does not necessarily spell. Sagan took the failures as a dramatist such as backhand casino, with respect to the whims of the Bank and the sky. It must be losing a little, better enjoy the gains overnight. Who did not see “recover” in a few minutes lost a whole night can not understand how happy it is the sort of taunt.
Verdict: Still terrible.
Google Translate, 2018
Sometimes the success was not there. We can think about it very well, the good number does not necessarily come out. Sagan took dramatic draftsman’s chess like the casino setbacks, with respect for the whims of the bank and the sky. It is necessary to lose a little, to better savor the gain of tomorrow. Whoever has not seen her “recover” in a few quarters of an hour the losses of a whole night can not understand how happy it is to taunt the spell.
Verdict: Still terrible, 14 years later!
It’s very interesting to think about how to build systems that could conceivably translate phrases like this “properly”, by using the ideas from the book.
Recall the standard ReLU function from neural networks:
$$
\texttt{ReLU}(x) = \max(0, x) = \begin{cases}
x & x > 0 \\
0 & \text{otherwise}
\end{cases}
$$
All well and good. But what if I want to apply a function to the lower half of this function, instead of setting it to 0? In fact, what if I want to apply a function to the top half as well! And while we’re at it, why should the inflexion point always be 0?
So, here’s the fugu function:
$$
\texttt{fugu}(x, f, g, p) = \begin{cases}
g(x) & x > p \\
f(x) & \text{otherwise}
\end{cases}
$$
Then, ReLU(x) = fugu(x, 0, id, 0), if you wish.
Here’s the fugu function in Python TensorFlow:
import tensorflow as tf

def fugu(x, f, g=lambda x: x, point=0):
    # As in the definition above: g(x) where x > point, f(x) otherwise.
    cond = tf.greater(x, point)
    return tf.where(cond, g(x), f(x))
Then, tf.nn.relu(x) = fugu(x, tf.zeros_like).
What kinds of cool/useful functions can you build with this?
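As one possible answer, here’s a minimal sketch of the same idea in plain NumPy (NumPy rather than TensorFlow only so that it runs anywhere; the names fugu_np, relu and leaky_relu are made up for this illustration). It follows the definition above: g where x > p, and f otherwise.

```python
import numpy as np

def fugu_np(x, f, g=lambda x: x, point=0.0):
    """NumPy analogue of fugu: g(x) where x > point, f(x) otherwise."""
    x = np.asarray(x, dtype=float)
    return np.where(x > point, g(x), f(x))

def relu(x):
    # ReLU: zero on the lower half, identity on the top half.
    return fugu_np(x, np.zeros_like)

def leaky_relu(x, alpha=0.1):
    # Leaky ReLU: scale the lower half by alpha instead of zeroing it.
    return fugu_np(x, lambda v: alpha * v)

xs = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(xs))
print(leaky_relu(xs))
```

Moving the point, or swapping in something other than the identity for g, gives a whole family of activations from the one helper.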
Exercise: Can you use the fugu function to build a kind of “stairway-to-relu” function?
One of the main reasons I loved this idea so much is that almost all machine learning that you see concerns itself with fixed output dimensions; at least for images. The cool thing about the CPPN is that it maps pixel coordinates, along with some configurable latent vector $\vec{z}$, to RGB values:
$$
\text{cppn}(x, y, \vec{z}) = (r,g,b)
$$
This is cool because there is a value defined for every point! So you can use these things to create arbitrarily large pictures! Furthermore, for a given $\vec{z}$ we can make higher-resolution images by evaluating the network over a finer grid of coordinates.
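To make that concrete, here’s a rough sketch of a tiny CPPN in plain NumPy. It is not the cppn3d code: the random untrained weights, the tanh/sigmoid choices and the name make_cppn are all assumptions for illustration. The point is only that the same function and the same $\vec{z}$ can be evaluated at any resolution.

```python
import numpy as np

def make_cppn(latent_dim=4, net_size=16, num_dense=3, seed=0):
    """Build a tiny random CPPN: (x, y, z) -> (r, g, b), each in [0, 1]."""
    rng = np.random.default_rng(seed)
    sizes = [2 + latent_dim] + [net_size] * num_dense + [3]
    weights = [rng.normal(0.0, 1.0, size=(a, b))
               for a, b in zip(sizes[:-1], sizes[1:])]

    def cppn(width, height, z):
        # Coordinates always span [-1, 1], whatever the resolution.
        xs, ys = np.meshgrid(np.linspace(-1, 1, width),
                             np.linspace(-1, 1, height))
        coords = np.stack([xs.ravel(), ys.ravel()], axis=1)
        zs = np.tile(z, (coords.shape[0], 1))   # same z at every pixel
        h = np.concatenate([coords, zs], axis=1)
        for w in weights[:-1]:
            h = np.tanh(h @ w)
        rgb = 1.0 / (1.0 + np.exp(-(h @ weights[-1])))  # sigmoid
        return rgb.reshape(height, width, 3)

    return cppn

cppn = make_cppn()
z = np.random.default_rng(1).normal(0, 1, size=4)
small = cppn(32, 32, z)
large = cppn(256, 256, z)   # same picture, just more pixels
```

Because the coordinates are normalised, the corners of the small and large images land on exactly the same inputs, so they get exactly the same colours.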
At Silverpond we’ve put this idea to good use in our upcoming event at Melbourne Knowledge Week.
In any case, here I’d like to document my playing around with the idea of using CPPNs to generate 3D landscapes.
I’ve put together some pieces of code here: cppn3d. Thanks to the amazing MyBinder you can even run the notebook online, right now, and start generating your own cool images!
To use the Python code, say, take a look at the notebook and you’ll see something like this (after imports):
latent_dim = 9
TAXICAB = ft.partial(np.linalg.norm, axis=0, ord=1)
EUCLIDEAN = ft.partial(np.linalg.norm, axis=0, ord=2)
INF = ft.partial(np.linalg.norm, axis=0, ord=np.inf)
norms = []
c = Config( net_size = 20
          , num_dense = 5
          , latent_dim = latent_dim
          , colours = 3
          , input_size = 1 + 1 + len(norms) + latent_dim
          , norms = norms
          , activation_function = tf.nn.tanh
          )
size = 512
width = size
height = size
m = build_model(c)
z = np.random.normal(0, 1, size=c.latent_dim)
sess.run(tf.global_variables_initializer())
yss = forward(sess, c, m, z, width, height)
ys = stitch_together(yss)
The magic here is that we can get quite different pictures by mucking around with the params: net_size, num_dense, norms, activation_function and basically just about anything!
The very simplistic idea I had was that we can generate images with nice smooth colours, then just map those colours to heights, and that’s the end of it! I did this in three.js and TensorFlow.js at first, with some terrible code:
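The colour-to-height step itself is tiny. Here’s a sketch in NumPy, assuming the image is an (H, W, 3) array with values in [0, 1] and that a plain average of the three channels is good enough (the same averaging the Unity script further down uses); colours_to_heights is a made-up name.

```python
import numpy as np

def colours_to_heights(image):
    """Map an (H, W, 3) RGB image in [0, 1] to an (H, W) height map in [0, 1]."""
    image = np.asarray(image, dtype=float)
    return image.mean(axis=2)   # average of r, g and b per pixel

# A 2x2 test image: black, white, pure red, mid grey.
image = np.array([[[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]],
                  [[1.0, 0.0, 0.0], [0.5, 0.5, 0.5]]])
heights = colours_to_heights(image)
print(heights)
```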
It worked! You can also play with this live if you wish; it does a kind of cool animation, albeit kinda slowly.
Of course, what I really wanted was to get a feel for how “walkable” or “playable” the resulting map would be. So I found my way to Unity3D, and half-wrote, half-googled a tiny script to load in the image as a height map:
using System.IO;
using UnityEngine;

public class TerrainHeight : MonoBehaviour {
    public int height = 400;
    public int width = 400;
    public int depth = 200;
    public string cppnImage = "/home/noon/dev/cppn3d/python/multi2.png";

    void Start () {
        Terrain terrain = GetComponent<Terrain>();
        terrain.terrainData = GenerateTerrain(terrain.terrainData);
    }

    TerrainData GenerateTerrain (TerrainData data) {
        data.size = new Vector3(width, depth, height);
        data.SetHeights(0, 0, GenerateHeights());
        return data;
    }

    // Load a PNG from disk; LoadImage resizes the texture to fit the file.
    public static Texture2D LoadPng (string filePath) {
        byte[] data = File.ReadAllBytes(filePath);
        Texture2D texture = new Texture2D(2, 2);
        texture.LoadImage(data);
        return texture;
    }

    // Average the RGB channels of each pixel to get a height in [0, 1].
    float[,] GenerateHeights () {
        // SetHeights expects the array indexed as [y, x].
        float[,] heights = new float[height, width];
        Texture2D image = LoadPng(cppnImage);
        for (int x = 0; x < width; x++) {
            for (int y = 0; y < height; y++) {
                Color colour = image.GetPixel(x, y);
                float sum = colour.r
                          + colour.g
                          + colour.b;
                heights[y, x] = sum / 3;
            }
        }
        return heights;
    }
}
In Unity3D, you attach this script to a terrain, then when you run it, it will set that piece of terrain to have the given heights you want!
Looks alright! Obviously my general Unity skills need work, but at least it looks something like a landscape! Here are a few more top views generated by a bunch of similarly produced images:
The images that generated these (not in order) are in the maps folder:
Anyway, I hope someone finds this useful! I hope I can play with this idea a bit more! I think there’s a lot of juice to squeeze here, in terms of using CPPNs to generate different levels of detail; to add much more detail to the Unity terrain by making decisions based on height (such as where water goes, where snow starts, etc.). Furthermore, it would also be neat to auto-generate town locations, and just about everything! Then of course there’s all the details of the CPPN itself to play with; the layer structure, adding more variables, using different norms to highlight different regions of the resulting image; the mind boggles at the options!
I hope this demonstrates how fun CPPNs can be!
As an aside, early in the day I was experimenting with producing large tiled images.
The basic idea is conveyed here:
On the left I have a particular image that I’ve generated. I want to continue this image downwards by one tile. On the right is the same image with the next tile.
This idea was due to Gala (who works for Neighbourlytics): basically, given that we have the optimisation machinery at hand, why not just attempt to find a new image, from the network, whose border matches at the point we’re interested in?
Initially, my idea was that I could do this by optimising over the z vector $\vec{z}$ only; i.e. leave all the other parameters of the network alone. This turned out not to work at all. I’m actually not quite sure why, because my experience with CPPNs is that if $\vec{z}$ is large, then you can get a whole bunch of variation by modifying it. In any case, I tried it, and while it did manage to make some progress, it was never really particularly good.
When that approach didn’t work, I used the one that generated the tile connections from above: I just optimised with respect to the entire CPPN network.
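For reference, the kind of quantity being minimised can be sketched like this. This is only my guess at a reasonable loss, written in NumPy rather than as a TensorFlow op, and border_mismatch is a made-up name: it measures how far the bottom row of one tile is from the top row of the tile below it.

```python
import numpy as np

def border_mismatch(tile_above, tile_below):
    """Mean squared difference between the bottom row of one tile
    and the top row of the tile that sits underneath it."""
    bottom = np.asarray(tile_above, dtype=float)[-1]
    top = np.asarray(tile_below, dtype=float)[0]
    return float(np.mean((bottom - top) ** 2))

rng = np.random.default_rng(0)
tile = rng.uniform(0, 1, size=(8, 8, 3))
mirrored = np.flipud(tile)              # its top row is tile's bottom row
print(border_mismatch(tile, mirrored))  # borders line up exactly
print(border_mismatch(tile, 1.0 - tile))
```

Note that a loss like this only constrains one row of pixels, which leaves the optimiser free to do whatever it likes with the rest of the tile.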
There were a few problems with this approach, unfortunately:
The tiles it generated were less “interesting”: In the image above, the one on the left is made of 3 tiles. The top one is the starting one; note its complexity. The following two tiles are very low in interestingness, but the final one is actually not bad. This perhaps makes sense, as when the optimiser only has to match one colour, it can allow itself some richness in the other region.
It didn’t work when I tried to match up two boundaries:
In all these pictures, the bottom-right tile is very out of sync with its two neighbours. This could definitely be fixed “in post”, by simply blending it, but it’s still slightly unsatisfying that I couldn’t solve this within the CPPN framework. One original idea I had was to solve it by using (something like) the interpolation process you see in the live JS example. Namely, we can pick two vectors $\vec{z_1}$ and $\vec{z_2}$ and move smoothly between them. When you watch this animate, you can feel like there should be some smoothing operation that would let us draw out a long line in this fashion. I think the approach would be to take, slice-by-slice, new images from vectors $\vec{z_{n+1}}$, and use the slices from them to produce a landscape. This feels slightly odd to me, but perhaps would be nice.
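The “move smoothly between them” part is easy to sketch. Here I’m assuming plain linear interpolation (the live JS example may well do something fancier), and interpolate_latents is a made-up name. Each row of the result is one intermediate vector that could be fed to the CPPN to render one slice.

```python
import numpy as np

def interpolate_latents(z1, z2, steps):
    """Return `steps` vectors moving linearly from z1 to z2, endpoints included."""
    z1 = np.asarray(z1, dtype=float)
    z2 = np.asarray(z2, dtype=float)
    ts = np.linspace(0.0, 1.0, steps)[:, None]   # one blend weight per step
    return (1.0 - ts) * z1 + ts * z2

z1 = np.array([0.0, 1.0, -1.0])
z2 = np.array([2.0, 1.0, 3.0])
path = interpolate_latents(z1, z2, steps=5)
print(path)
```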
In the end, my realisation was that I can produce very large maps simply by increasing the richness in the CPPN: increasing the numbers of dense layers, and “net size” (units in the dense layers), and then just simply making a high-resolution version of the resulting image:
In many ways I think I’m still a bit unsatisfied by this approach. I think ultimately it would be nice to have a grid-layout map:
Where each block is controlled by some vector $\vec{z_i}$, and those can be modified at will. This would definitely be possible just by blending in some standard way between the particular $\vec{z_i}$ values, but I do still think there should be a CPPN-based solution. One idea Lyndon had was to directly construct the image from the grid, then encode that back into the CPPN, then decode it, to get the “closest match” that is still smooth between the borders. I think this might work, but here we don’t have an encoding network.
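By “blending in some standard way” I have in mind something like bilinear interpolation. Here’s a sketch under that assumption, in NumPy, with the made-up name blend_latent_grid: it takes a small grid of latent vectors (at least 2x2) and produces one blended vector per pixel, which could then be handed to the CPPN alongside the pixel coordinates.

```python
import numpy as np

def blend_latent_grid(z_grid, width, height):
    """Bilinearly blend a (rows, cols, dim) grid of latent vectors
    into a (height, width, dim) field. Needs rows >= 2 and cols >= 2."""
    z_grid = np.asarray(z_grid, dtype=float)
    rows, cols, _ = z_grid.shape
    # Fractional grid position of every output pixel.
    gy = np.linspace(0, rows - 1, height)
    gx = np.linspace(0, cols - 1, width)
    # Index of the grid cell each pixel falls in (clipped at the far edge).
    y0 = np.clip(np.floor(gy).astype(int), 0, rows - 2)
    x0 = np.clip(np.floor(gx).astype(int), 0, cols - 2)
    ty = (gy - y0)[:, None, None]
    tx = (gx - x0)[None, :, None]
    top = (1 - tx) * z_grid[y0][:, x0] + tx * z_grid[y0][:, x0 + 1]
    bottom = (1 - tx) * z_grid[y0 + 1][:, x0] + tx * z_grid[y0 + 1][:, x0 + 1]
    return (1 - ty) * top + ty * bottom

grid = np.random.default_rng(0).normal(0, 1, size=(2, 2, 3))
field = blend_latent_grid(grid, width=5, height=5)
```

The corners of the field are exactly the grid vectors, and everything in between is a smooth mix, so editing one $\vec{z_i}$ only changes the blocks around it.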
If you have any ideas along these lines, or find any of this useful, then I’d love to hear from you!
I found a cool plugin in Unity — The Terrain Toolkit — that lets me easily add textures, and I worked out how to add a water plane (you just find it in the standard assets, and drag on the “Prefab”, and resize it), so we can give the maps a more earthly look and feel:
So cool! (I also updated the code so you can more easily express richer layers in the CPPN, check out the Jupyter Notebook Generate Maps for more deets.)
So this year I’ve finally started my long-dreamt-about fashion label where each design is produced by some kind of computer program: Noon on PAOM.
For the time being I’m playing around with the website Print All Over Me. I like it because you can, surprisingly, put your designs all over the clothes!
So far I’ve made two main things,
Links to buy:
This is the first thing I’ve made in this way, and it’s interesting to me because it combines many things I’m interested in.
It’s built using deep learning, Haskell and dance. Specifically, I used what’s referred to as a “Pose Network”, to watch a dance video, and infer from that video the poses that the dancer was in at the time. From there, I used a small Haskell program to take those poses, and lay them out in a colourful way.
These were inspired by some 80s-style retro imagery that I found one day. I also put a bit of thought into how to get the graphics displayed in a nicely randomised way, and via my friend Reuben came up with a scheme that I wrote about on the Silverpond blog: Low-Discrepency Sequences, Haskell, and T-Shirts!
In the end I decided to make a whole bunch of different items available, so hopefully there is something for everyone here. If there’s something on the PAOM catalogue that I haven’t created, or you’d like a custom colour scheme, then get in touch and I can make something for you! Also, if you do end up buying something, then send me a photo, or tag it with #retrohaskell or #nvds. I’d love to see how it looks and any crazy colour combinations!
Below I’ve enumerated all the items in the store, so you can click on the thing you like to buy it if you wish! Hopefully you’ll be seeing more open source clothing from me in the future :)
Evolution of Dance: Black Tako
Evolution of Dance: Classic
Retro Haskell: Black Tako
Retro Haskell: Classic
Retro Haskell: Aquamarine Fiesta
Retro Haskell: Black Inka
Retro Haskell: Purple At The Beach
Retro Haskell: Purple Fiesta
Retro Haskell: Pink Carnaval
Retro Haskell: Christmas