Modelling imagination – Part I

NOTE: This is a technical blog. I will use this and the following parts as a journal for ideas on modelling imagination on a computer. It is mainly intended as a log for my own later use and as inspiration for others who would like to try.

Hypothesis: To teach a machine understanding, it is necessary to give it some instrument for “Gedankenexperiments” (thought experiments), that is, a way to represent knowledge visually and to operate on the imagined objects in order to test hypotheses.

Leading example: An apple on a table.

The initial task is to tell the machine what an apple and what a table is. Furthermore, physical laws have to be provided which can be used for manipulating the objects. Although there exist places where objects are not pulled down, like in an orbiting space station, it is an acceptable starting assumption that every object with a density higher than air is pulled downwards.

When I imagine an apple on a table I think of a sphere on a box of size 2 x 1 x 0.05 m with four boxes of size 1 x 0.1 x 0.1 m about 10 cm from the four corners of the first box. These four boxes are attached to a plane which fades out a few metres away from the objects.
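As a rough sketch, the imagined scene can be written down as plain data in Python. The sizes are the ones given above; everything else is an assumption made only for this illustration: the name `Box`, the coordinate convention (z points up, the ground plane is z = 0, positions are box centres), the legs standing upright with their 1 m side vertical, and "10 cm from the corners" read as a 10 cm inset from both edges.

```python
from dataclasses import dataclass


@dataclass
class Box:
    name: str
    center: tuple  # (x, y, z) in metres, z points up
    size: tuple    # (length, width, height) in metres


def build_table(leg_inset=0.10):
    """Build the table top and four legs standing on the plane z = 0."""
    top_size = (2.0, 1.0, 0.05)
    leg_size = (0.1, 0.1, 1.0)  # assumption: legs are 1 m tall
    leg_h = leg_size[2]
    top = Box("top", (0.0, 0.0, leg_h + top_size[2] / 2), top_size)
    # Leg centres: inset from the edge plus half a leg width.
    dx = top_size[0] / 2 - leg_inset - leg_size[0] / 2
    dy = top_size[1] / 2 - leg_inset - leg_size[1] / 2
    corners = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    legs = [Box(f"leg{i}", (sx * dx, sy * dy, leg_h / 2), leg_size)
            for i, (sx, sy) in enumerate(corners)]
    return top, legs


def apple_rest_height(top, radius):
    """Height of the sphere's centre when it rests on the table top."""
    return top.center[2] + top.size[2] / 2 + radius
```

Under these assumptions the table surface sits at 1.05 m, so a sphere of radius `r` resting on it has its centre at 1.05 + r metres.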

Now what happens if the table is removed?
I imagine the sphere falling towards the ground, bouncing slightly and coming to rest.
A real apple would start to rotate while falling and get some bruises when hitting the ground, but when I imagine the situation, the apple does not have a stem or other structures; it is an ideal, slightly elastic sphere. To get a fast estimate of what will happen, it is not important that an apple is not a perfect sphere, and it doesn’t matter whether it has a stem or not. In other situations this may be different. So modelling imagination should be able to do the following:

– Build a model of the object
– Select important features for the specific situation
– Manipulate the objects in a simplified way

Build a model of the object
Some properties which can be assigned to all physical objects:
– Base-form (box, sphere, disc, rectangle, point)
– Color/structure
– Hardness
– Weight

All models should start with one and only one base form. This base form can then be extended to build a detailed picture of an object. In the example with the apple and the table this would be a sphere for the apple and a box for the table (don’t mind the legs). This would already be enough to imagine what would happen to an apple when the table is removed, since the table legs don’t interact with the apple.
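To illustrate how little is needed, here is a minimal sketch of the falling sphere as a one-dimensional simulation in Python. Only the base form (a sphere with a radius) and "slightly elastic" are used. The function name `drop_sphere`, the restitution coefficient of 0.3, the time step and the rest threshold are assumed values chosen for illustration, not measured ones.

```python
def drop_sphere(height, radius, restitution=0.3, g=9.81, dt=0.001,
                rest_speed=0.05):
    """Drop an ideal, slightly elastic sphere onto the ground plane z = 0.

    `height` is the initial height of the sphere's centre in metres.
    Returns (bounces, final_z), where final_z is the resting height
    of the centre.
    """
    z, v = height, 0.0
    bounces = 0
    while True:
        v -= g * dt          # gravity pulls downwards
        z += v * dt
        if z <= radius and v < 0:
            z = radius       # the sphere touches the ground
            if abs(v) * restitution < rest_speed:
                return bounces, z   # too slow to bounce again: at rest
            v = -v * restitution    # bounce with reduced speed
            bounces += 1
```

For example, `drop_sphere(1.13, 0.08)` drops a sphere of radius 8 cm from roughly table height. The stem, the core and the rotation of a real apple never enter the calculation.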
Another example: Imagine how to cut an apple. Here it matters that the apple has a stem and has a core (which we assume we don’t want to eat). This leads us to:

Selecting important features
When I imagine situations, especially complicated situations with many objects, I don’t care about the details of the objects and what implications these have for the situation. Imagine a football hitting a window. I don’t care about the football’s structure or that it has a valve, and I don’t care about the house of which the window is a part. Actually, not even the thickness of the glass or the window frame is important. The only thing that matters is that the football hits the window and the window splinters. Splintering glass is complicated, so I imagine the result as a plate with a football-sized hole. So how to select what is important?

Manipulating objects
How did the hole get into the glass in the example above? I imagine the ball moving in a straight line, and from experience I know that the glass is (by assumption) the weaker object and that glass splinters locally when hit. Hence the ball removes the part of the glass which it passes through in its linear motion.
The knowledge about glass being fragile has to be provided by experience. I guess that a child who had never experienced splintering glass would not be able to tell what would happen to the window, and would have to ask its parents or try hitting the window with the ball.
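The straight-line reasoning can be sketched in a few lines of Python. The names `Pane` and `throw_ball`, the numeric `strength` score standing in for experience, and the rule "the weaker object gives way" are hypothetical. The pane is assumed to lie in a plane x = constant and is treated as unbounded, and the hole is approximated as a disc of the ball's radius even for oblique hits.

```python
from dataclasses import dataclass, field


@dataclass
class Pane:
    x: float          # the pane lies in the plane x = const
    strength: float   # assumed score learned from experience
    holes: list = field(default_factory=list)  # removed discs (y, z, radius)


def throw_ball(start, direction, ball_radius, ball_strength, pane):
    """Move the ball on a straight line and remove the glass it passes.

    Returns True if the ball went through the pane and left a hole.
    """
    sx, sy, sz = start
    dx, dy, dz = direction
    if dx == 0:
        return False  # moving parallel to the pane
    t = (pane.x - sx) / dx
    if t < 0:
        return False  # moving away from the pane
    hit_y, hit_z = sy + t * dy, sz + t * dz
    if ball_strength <= pane.strength:
        return False  # assumed rule: only the weaker object breaks
    pane.holes.append((hit_y, hit_z, ball_radius))
    return True
```

A ball thrown from `(0, 0, 1)` in direction `(1, 0, 0)` at a pane at x = 5 leaves one disc-shaped hole at y = 0, z = 1. Nothing about splinters, the frame or the thickness of the glass is modelled.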


To represent objects I imagine a framework which works similarly to Multi Resolution Analysis (MRA) for wavelets or regression trees with binary splits, where each split models the underlying function more precisely. As said, objects should have a single base form, and by selecting higher levels this base form is replaced by more complex combinations of several base-form objects. Just imagine the table from the first example, which is initially modelled as a box and can be resolved into a thin box with four other boxes attached. These models could be stored by keeping a small collection of base forms and defining the properties of each object in terms of a single base form and several additions. For example, the apple could be modelled as:
– a sphere with radius 8cm
– Red / smooth surface
– Slightly elastic
– 200g

Additions could be:
– a cylinder, radius 1mm, height 3cm
– Brown / smooth surface
– Elastic
– 5g
– Attached: On top of the sphere

– a cylinder, radius 2cm, height (2*radius of sphere)
– Light yellow-green / rough surface
– Slightly elastic
– 100g
– Replaces: Centre of the sphere

The additions are hence defined as an “addition, replacement or exclusion” relative to the base structure. Each addition can again have additions of its own, such that an object is built from a hierarchy of additions.
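One possible way to store such a hierarchy, as a sketch in Python: each part has a single base form, its properties, a relation to its parent, and its own list of additions. The names `Part` and `resolve` and all field names are hypothetical. `resolve(level)` returns the parts visible at a given level of detail, where level 0 is the bare base form.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    name: str
    base_form: str          # "sphere", "box", "cylinder", ...
    surface: str
    elasticity: str
    weight_g: float
    relation: str = "base"  # "base", "attach", "replace" or "exclude"
    where: str = ""
    additions: list = field(default_factory=list)

    def resolve(self, level):
        """Return all parts down to `level` (0 = this base form only)."""
        parts = [self]
        if level > 0:
            for addition in self.additions:
                parts.extend(addition.resolve(level - 1))
        return parts


apple = Part(
    "apple", "sphere", "red / smooth", "slightly elastic", 200,
    additions=[
        Part("stem", "cylinder", "brown / smooth", "elastic", 5,
             "attach", "on top of the sphere"),
        Part("core", "cylinder", "light yellow-green / rough",
             "slightly elastic", 100, "replace", "centre of the sphere"),
    ],
)
```

Imagining the falling apple would only need `apple.resolve(0)`, while imagining how to cut it would need `apple.resolve(1)` to bring in the stem and the core.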

Next part: Feature selection
