WRITING
How do you think? Do you decide to think your thoughts, or do they just occur to you? Are you generally aware that you are thinking, or are you lost in thought? What is your everyday experience like? Do you feel aware of and in control of your thoughts? Are your thoughts delivered as voices? Do your thoughts repeat? Are your thoughts mostly positive or negative? Do you think one thought at a time? What makes one thought follow another? Are subsequent thoughts usually temporally or spatially related? Across how many dimensions can a thought be related to others? Are you the same person over time? How can you better get to know how you think?
Do some research about illusions that you live with. What is an impenetrable illusion? Where do illusions come from, and what value do they have? Can technology overcome illusions?
I don’t think I was very well tuned in to my thought processes before I began practicing meditation. I was woefully unaware of how thoughts were manifested, how I processed them in the moment or what it meant to clear my mind. Despite having a marginally better understanding of my own thought processes, I concede that I’m still far from truly understanding how my awareness works. That marginal improvement, however, has helped me dramatically in setting my mind to quiet when I need it.
Today, despite my best efforts at practicing mindfulness, I still feel as though my conscious ability to conjure thoughts is lacking. My thoughts simply occur and I am not in control of how they bubble up. Taking a step back, however, I recognize that I have both conscious and subconscious trains of thought that operate in different ways. I believe my subconscious is running in overdrive to control which thoughts bubble up and thereby taking a lot of load off of my conscious mind.
Practically speaking, I believe my subconscious is adept at suppressing thoughts that have little bearing on me in the present or near future. This means that, with little effort, I can keep things I have little control over out of my primary cognition. It allows me the space to focus on what’s most pressing without the distraction of things that don’t affect me or that I know are only worth dwelling on when the time is right.
At times, this mode of processing feels almost too practical and means that I occasionally have trouble relating to people who don’t function this way. People who get in their own head and have trouble taking the next, most logical step forward don’t feel relatable to me. I’ve long known that I have a remarkable ability to suppress thoughts that are traumatic or troubling - sometimes to the great mystery or distress of those around me.
from Be Here Now by Ram Dass
At the end of the day, this gives me a great sense of control in being able to steer my brain but I sometimes feel as though I’m not entirely the one responsible for this power. I don’t know why my thoughts are mostly positive. I can’t put my finger on why my subconscious only surfaces one thought at a time. I think the foreground conscious processing has everything to do with what my subconscious surfaces next in my train of thought. It feels wrong to say that my conscious mind reached in and grabbed the thought that it wanted next.
When I practice meditation, I get the sense that my thoughts are arranged in a pseudo-spatial way, but I highly doubt that this is correlated to the way my memories are actually stored. I recently listened to a mind-expanding podcast where Lex Fridman interviewed the German AI researcher and cognitive scientist Joscha Bach. His work is focused on cognitive architectures, mental representation, emotion, and social modeling, among other things. I was simply blown away by their conversation. Bach believes that “we don't exist in the physical world. We do exist inside of a story that the brain tells itself.” Researching cognitive architectures built by the brain led me to a research paper titled The Interface Theory of Perception by Hoffman, Singh & Prakash. In it, they build a simple yet powerful analogy: a computer’s digital interface conceals transistors and firmware in favor of colorful screen-based icons and text – a symbolic representation for ease of use.
“Just as the color and shape of an icon for a text file do not entail that the text file itself has a color or shape, so also our perceptions of space-time and objects do not entail (by the Invention of Space-Time Theorem) that objective reality has the structure of space-time and objects. An interface serves to guide useful actions, not to resemble truth…For the perceptions of H. sapiens, space-time is the desktop and physical objects are the icons. Our perceptions of space-time and objects have been shaped by natural selection to hide the truth and guide adaptive behaviors. Perception is an adaptive interface.”
As a designer, is it fair to assume that the pseudo-spatial organization of my conscious thoughts is an adaptive interface designed to suit my organizational prerogatives? If so, this is quite the illusion.
I’m a sucker for the type of drama where illusions are cloaked in a performance and designed to simultaneously distract, engage and disentangle you from your own life. Whether a momentary experience or an hours-long foray, that sense of immersion can be very powerful on the psyche and in memory-making. The illusions used in themed entertainment (e.g. Disney’s Galactic Starcruiser) or immersive theater (e.g. Sleep No More) dismantle one’s understanding of what is real and often ascribe to the observer/participant a new implied identity, temporarily, in the name of art and entertainment.
It’s no secret that illusions are a fantastically effective tool for altering mindsets. Impenetrable illusions are ones so convincingly deceptive that it becomes impossible to distinguish them from reality. One might argue that organized religions are impenetrable illusions that shape the way huge portions of the population understand and relate to the world we live in. Getting even more conceptual and philosophical, I look back to my undergraduate study in Philosophy and recall many discussions around the nature of reality and perception that questioned whether what we perceive as reality might instead be an illusion. It’s these existential questions that drew me into philosophy in the first place, and it’s this central debate that keeps me coming back.
I find it fascinating that researchers like Bach and others in his field who have dedicated themselves to the study of artificial intelligence are simultaneously forced to grapple with questions around reality, perception, and consciousness. It’s entirely possible that technology and science can lead us to answers to some of our most consequential questions, but in doing so we may be opening Pandora’s box.
MAKING
For the making component of the assignment, I tweaked the Stream of Consciousness js file pulled from the shared class repository. Before I was able to get going, I spent some time getting my IDE – VS Code – dialed in with some bells and whistles. Live Server and Copilot were easy to set up, but getting GitHub connected was much trickier. I was able to connect my GitHub account within VS Code fairly easily. Pulling a repository didn’t pose any issues, though I’m not sure that hinged on being logged in to GitHub. The main challenge I faced was committing to GitHub from within VS Code. I’m confident I’ll be able to crack it after I step away for a night, but for now I’ve settled for manually uploading code to GitHub. It does the trick.
In playing with the Stream of Consciousness js file, I challenged myself to see how well I could prompt Copilot to tweak the existing js file in a meaningful way. I had several ideas that I cycled through to see how well Copilot would deliver before landing on something that felt interesting. I’ll try to describe some of the animations I attempted to build with Copilot. Caveat: these are not my actual Copilot prompts but distillations for succinctness.
1. Rather than drawing each new input string to a fresh place on the canvas, create one continuous string of characters where each new input is concatenated to the end. Treat each character as a freely moving node, but retain the linkage between adjacent nodes. Animate the continuous string around the canvas to mimic the slithering of a snake.
2. Create an invisible circle that is 30% of the width of the viewport. Around that circle, draw each new word emanating outward from the center like a sun’s rays, so that every word is an angled ray with the circle as its axis.
The result that I ended up enjoying is an animation that reminds me of what a stream of consciousness must feel like for someone suffering from mania. After each new word is submitted, the individual characters explode outward from the origin. As more words are added, the scene becomes a chaotic yet beautiful arrangement of letters that, although without any present meaning, once related the active thoughts of the participant. In some way, I could see this code being evolved to recombine the words once again. I think it’d be beautiful.
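To give a sense of the mechanics, here is a reconstruction in p5.js-style code (not my actual Copilot output; the function names and numbers are illustrative). Each character of a submitted word becomes a particle with its own random velocity:

let particles = [];

function setup() {
  createCanvas(windowWidth, windowHeight);
  textSize(24);
  fill(255);
}

// Called once per submitted word, e.g. from the text input's handler
function explodeWord(word) {
  for (const ch of word) {
    particles.push({
      ch: ch,
      x: width / 2,      // every character starts at a shared origin
      y: height / 2,
      vx: random(-3, 3), // random outward velocity per character
      vy: random(-3, 3),
    });
  }
}

function draw() {
  background(0, 20); // low-alpha background leaves fading trails
  for (const p of particles) {
    p.x += p.vx;
    p.y += p.vy;
    text(p.ch, p.x, p.y);
  }
}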
Assignment #2
WRITING Is consciousness the primary thing? Does the hallucination start when you open your eyes? Can you make an interface for the inner experience you have when you close your eyes that you wrote about last week? Should interfaces mimic what you see in front of you when you open your eyes with a virtual reality headset, or should we aim at something more fundamental? Are your lines of thinking typically along dimensions of time or space? In other words, is the content of each subsequent thought related to the previous one based on location or chronology? Do your thoughts have much to do with location or chronology at all? Will spatial interfaces like VR, so firmly anchored in continuous 3D space with continuously flowing time, serve to depict your thoughts?
Hoffman says time and space are just a simplifying interface (ITP, the Interface Theory of Perception) for a more complicated reality. Should we make our interfaces address that more complicated reality directly instead of making an interface to an interface?
Can we say that consciousness is the primary thing, or is it, as Hoffman describes, a simplified representation of something “else” that our senses manifest to serve us? There are some fantastic ocular experiments that anyone can perform which point to the fact that our visual experience of the world is not entirely as we understand it to be. These simple experiments show that our eyes manifest a great deal of the visual image we see through fascinating ocular adaptations. Could it be that the mental/visual trickery goes far deeper than we have the ability to understand? It’s possible.
What about the images we see when our eyes are closed? To what can we attribute these visual hallucinations? I wish I could put forth a strong point of view on this topic, and I’m deeply curious whether a newborn child with no visual experience of this world has mental images with spatial and visual cues. Is it possible to have visual representations in the mind with no personal experience to reflect on? This would be a fascinating topic to research. It’s possible that EEG scans might be a meaningful starting point.
I found Hoffman’s storytelling and metaphors in support of the Interface Theory of Perception to be compelling, at least at first. His metaphors, specifically his example comparing our interpretation of the world to a complex framework of icons not unlike those on a computer’s desktop, feel eye-opening, mind-expanding, and a bit flashy. I listened to several of Hoffman’s talks and interviews, and I will admit that at first I was all-in. Hoffman excited his interviewers (I listened to both Lex Fridman’s and Sam Harris’ interviews) to such a degree that I felt there had to be something here. The more I unpacked his theory and tried to grapple with his position, however, the more I felt it was missing some critical underpinnings. He references complex computer modeling that consistently generates results supporting his theory, but is it possible at present to model a world and its inhabitants at a fidelity that realistically represents the world as we experience it? Can we model the emotional intangibles that drive so many of our actions and beliefs? Life is infinitely complex, and I’d like to know more about the parameters of Hoffman’s model.
I concede that I should continue to learn more about Hoffman’s thesis and read his source documentation before I discount any of it, and at present, I feel inspired by the possibilities of his argument but remain a healthy skeptic.
An amazing amount of research is currently being poured into virtual and mixed reality interfaces, and some tout spatial computing as the way we’ll liaise with technology in the near future. Perception, experience, and time are some of the key elements up for discussion when it comes to unpacking the potential of these products. For what it's worth, I believe the future of virtual reality will lead us predominantly toward mixed reality - experiences that overlay simulated imagery atop our primary experience of the world. Hardware is currently the limiting factor in achieving what is being dreamt up for mixed reality. I think this is the most logical next step and will help product designers overcome some of the uncanny feelings that come with a fully virtual experience - namely, how the illusion breaks when locomotion and motor feedback don’t match what we expect to see and feel.
Perhaps, though, we shouldn’t be trying to mirror our literal experience in the world through VR. It’s hard to know what we should be targeting, however, without some breakthroughs in understanding what new primitives are more fundamental than space-time. Without accessing what is deeper, or as Hoffman puts it, circumventing the pointers in our spatial cognition, I believe we’ll struggle with knowing what to model. Furthermore, what if the visual experience of this world tarnishes our potential to see anything differently? To use a reference from pop culture, the character Neo from The Matrix struggled with reframing his perspective after learning the true nature of reality. It’s reasonable to suggest that only a tabula rasa human being could be capable of developing a different mental model of reality. Would scientific experiments need to start with a human from the moment of birth or even before?
MAKING Use ML to make conjuring, rather than traveling, the important verb of your application. Can you give an interface for people to navigate the hyper-dimensional space of a Machine Learning generative model? For this week you might try just learning to use the fetch command with the APIs to explore various models in a service like Replicate.
My interface extends Dan O’Sullivan’s initial Stream of Consciousness interface that our class explored in week one. Our prompt was to build an interface capable of navigating the hyper-dimensional space of a Machine Learning generative model.
My concept centered on the idea that neural networks, natural thinking processes, and idea generation are all infinitely deep and broad. Thoughts, though experienced in a pseudo-linear way, can manifest from our subconscious into our conscious mind in any number of wild new directions, or just as easily follow a highly correlative train of thought.
The interface that I built attempts to create two hierarchies of neural possibilities that branch off from a central idea. The first stratum contains four words that are highly correlated to the central idea. The second stratum contains eight words that are creatively associated or analogously related to the central word. Obviously, neural networks and our minds have far more strata and complexity than this simple representation. At any rate, my interface looks like this:
Admittedly, I’m using the same API call as Dan O’Sullivan’s example because I was impressed with Meta’s Llama model and I wanted to get an interface MVP up and running first and foremost. I’m calling it twice in my project, with two distinct fetch queries, to ensure I have return values to populate the two strata of idea nodes. My two queries are as follows:
prompt = "a json list of " + ideaNodes + " words related to " + word;
prompt = "a json list of " + ideaNodes + " creative words that are analagously related to " + word;
Currently, the interface is wildly slow, but once it gets “warmed up” it runs much more quickly. At first, it can take 60 seconds or more to return and process data for new nodes. I’d like to dig into why this is the case; my guess is that Replicate cold-starts the model, so the first request pays a boot-up cost. When warmed up, I can click new values repeatedly and get responses rather quickly.
Other issues that I’m noticing are related to how raw responses are returned from the model. Consistency is not guaranteed in the current design, and occasionally some of my nodes do not populate because the raw response comes back in an unpredictable format. I think I could improve my parameters and adjust my prompts to make the response more consistent. That being said, when experimenting with the Meta Llama model directly in Replicate, even extremely specific prompts and parameters (e.g. “return a json file with a maximum of four words”) will still return an unpredictable raw response. Text at the beginning of the response (“Yes, here are some ideas related to *word*....”) and after (“These words all share a similar connotation of…”) is challenging to parse. Occasionally, the individual json values are not succinct and take the form of a long-form paragraph. I’m sure I can improve how I parse the raw statements to clean up my results, but to truly mitigate these issues, I think improved prompting or a new model will be essential.
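As a stopgap while I work on the prompting, one tactic is to ignore the surrounding chatter and salvage just the first bracketed span from the raw response. A sketch of that idea (the function name is my own):

// Pull the first [...] span out of a chatty response; returns [] when nothing parses
function extractWordList(raw) {
  const match = raw.match(/\[[\s\S]*?\]/);
  if (!match) return [];
  try {
    const parsed = JSON.parse(match[0]);
    return Array.isArray(parsed) ? parsed.filter((w) => typeof w === "string") : [];
  } catch (err) {
    return []; // the bracketed span wasn't valid JSON after all
  }
}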
The last issue I’m encountering seems to have something to do with how I’m sending out fetch requests, or maybe with how I’m parsing and storing return values. The issue manifests after hitting enter, as the nodes are populating: some, not all, of the nodes appear to flicker through several ideas/thoughts before landing on a final string. As of now, I’m not sure why this is the case.
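My working hunch, still unverified, is that overlapping fetches from successive submissions are resolving out of order, each overwriting the node text as it lands. If that’s right, tagging each submission and discarding stale responses should settle the flicker. A sketch, with invented helper names:

let round = 0; // incremented on every new submission

function onSubmit(centralWord) {
  const thisRound = ++round;
  // ideaNodes and promptFor stand in for my actual node list and prompt builder
  for (const node of ideaNodes) {
    fetchWordList(promptFor(node, centralWord)).then((raw) => {
      if (thisRound !== round) return; // a newer submission superseded this response
      node.label = extractWordList(raw)[node.index];
    });
  }
}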
Overall, I’m pretty happy with my first attempt at an API call and integration with an LLM!
Notes: Copilot and ChatGPT were instrumental in streamlining this project.