Billy Ho


Billy Ho (he/him) is a Creative Technologist and Designer who works at the intersection of Art, Research, and Commercial Technology. His work aims to evoke Introspection and Empathy for both humans and machines.

Billy holds a BFA in Industrial Design from Shih Chien University and an MFA in Design and Technology from Parsons School of Design. Click here to view the Résumé.



Making of: Are you there computer? It’s us, humans

 2022

From the very first conception to code and installation, this is the ongoing master documentation of the making of Are you there computer? It’s us, humans.

Conception



- Machine that “gets it”



I am creating an interactive experience to explore text-generating machine learning systems’ influence on modern human audiences.

Modern natural language processing technologies have largely been deployed as goal-oriented tools that streamline users’ intents into corporate responses, scaled up to maximize efficiency. Chatbots are deployed to replace customer call centers, sentiment-analysis models aggregate product reviews, and the same systems shape our daily interactions with smart home assistants and email auto-complete.

Across all of these domains, the general public’s first reaction to these breakthroughs has been remarkably consistent: an oscillation between wonder and unease at how closely the generated sentences resemble our own. It felt as if something had been breached, somewhere between humans and the tools they had crafted and used.

The question I am trying to answer is what influence these “human-like” tools have on us and on our fundamental understanding of our own identities. I do so by providing a physical space that invites audiences to interact with one such system, framed as a collaboration between them and me, and by amplifying and translating the generated content poetically into physical anamorphic letters emerging from the surfaces. My main interest and focus is the empathy implied in this machine-generated content and the human reactions to it.

I have chosen text as the medium generated by my collaboration with machines, for the following reasons:

  1. Compared to images, moving pictures, and visuals in general, text has a direct lineage to language, which, as far as we know, is an information-storage and communication system constructed by human beings; text has thereby become inseparable from human civilization.
  2. The current capabilities of language-based machine learning provide a more controlled environment for generated content than other mediums do.

For these reasons, I have decided to work with OpenAI’s natural language model gpt-3, for the large number of parameters learned from its training material, its open access, and its wide range of supported coding environments. I have fine-tuned the model on a collection of texts describing people’s dreams, so that it generates text in the style of the fine-tuning material. For the installation, I plan to render the generated text on a vintage 90s computer monitor within an immersive office setting, which I will elaborate on in a later section.

My thesis situates modern machine learning systems as intelligence technologies and examines the recursive influences between humans and those technologies, leveraging text as a property shared by both parties. Throughout it, I present a series of experiments on my interaction and collaboration with OpenAI’s gpt-3 and the different ways and mediums I have chosen to represent the process, followed by feedback assessments and subsequent iterations.

My interest in language-based machine learning came from its promise of introspection: through a technology that is ambiguously close to us yet external, the promise of a quantifiable understanding of empathy and human emotion. Because of the nature of the subject, I found throughout the testing sessions that audiences were often intrigued from the start; carried by the current cultural moment, most of them found my angle on the domain compelling.

My aim for the project is to provide an accessible framework that fosters discourse and reflection on the influence of ubiquitous intelligent systems outside of goal-oriented implementations. It centers on OpenAI’s gpt-3 and highlights potential in two directions: the flexibility of fine-tuning toward the appearance of an empathetic machine, and the treatment of the generated texts in ways that imply an external origin.

Data-Processing | Python Prompt to Completion | gpt-3



The final installation centers around an interactive chatbot running in a node.js-based webapp, which accesses a set of fine-tuned machine learning APIs hosted by OpenAI.


Natural language processing (NLP) is a subset of machine learning designated to work with human language; at a very high level, its essence is to predict an outcome (in the form of text) from a given prompt (also in the form of text). A modern NLP model like OpenAI’s gpt-3 has already been trained on a massive corpus of real-world human text and is published as a public API, a publicly accessible tool set that lets people further “prime” the model toward their desired outcomes.
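
For concreteness, a minimal prompt-to-completion request might look like the sketch below. It assumes the 2022-era openai Python package (pre-1.0 interface) and an API key in the OPENAI_API_KEY environment variable; the model name and the prompt are placeholders, not the ones used in the installation.

# Minimal prompt-to-completion sketch using the pre-1.0 openai Python package.
# The model name and the prompt are illustrative placeholders.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.Completion.create(
    model="text-davinci-002",   # a base gpt-3 model; a fine-tuned model ID can be swapped in later
    prompt="Are you there, computer?",
    max_tokens=64,
    temperature=0.9,
)

print(response["choices"][0]["text"])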


Fine-tune a model


As stated above, the model is already trained on a large amount of data. End users like us can access the model’s last layer and feed in our own custom dataset to perform a “transformation” of that layer, meaning the generated content will likely come out in the style of our curated dataset. This technique is called fine-tuning.



Data collection and processing


The first step consists of deciding what to select as the fine-tuning text, then reforming it into two groups: prompts and completions. To demonstrate, I will use part of my thesis paper as the fine-tune material.



The material is stored in a .txt file called pa01.txt. To send a fine-tune request to gpt-3, we need to process this file into prompts and completions. OpenAI’s official documentation requires a specific file format known as JSONL, with the content taking the following form:


{"prompt": "<prompt text>", "completion": "<ideal generated text>"}

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}

...


To start, I wrote a Python script to process the original text into a CSV file with two columns: [ Prompt, Completion ]





In this example, I split the material on the character “.”. Next, I export the file as a CSV, and we can use OpenAI’s command-line tools in the terminal to transform the CSV file into JSONL.
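
A rough sketch of that processing step is shown below; it is not the exact script, and the choice to pair each sentence with the one that follows it is an assumption about how the prompt/completion columns were filled.

# Split pa01.txt on "." and write a two-column CSV (Prompt, Completion).
# Pairing each sentence with the next one is an assumed strategy.
import csv

with open("pa01.txt", "r", encoding="utf-8") as f:
    sentences = [s.strip() for s in f.read().split(".") if s.strip()]

with open("pa01.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["Prompt", "Completion"])
    for prompt, completion in zip(sentences, sentences[1:]):
        writer.writerow([prompt, completion])

# The 2022-era OpenAI CLI can then convert the CSV into JSONL, roughly:
#   openai tools fine_tunes.prepare_data -f pa01.csv
# which writes a pa01_prepared.jsonl file next to the original.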


Once we have successfully created our JSONL fine-tuning file, we once again use OpenAI’s API to execute the fine-tuning; in this example I am making the request from the macOS terminal.
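
With the 2022-era CLI, that terminal request is roughly openai api fine_tunes.create -t pa01_prepared.jsonl -m davinci. The same request can also be made from the pre-1.0 openai Python package, sketched below; the file name and base model are illustrative.

# Equivalent of the terminal request, sketched with the pre-1.0 openai Python package.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

# Upload the prepared JSONL file for fine-tuning.
training_file = openai.File.create(
    file=open("pa01_prepared.jsonl", "rb"),
    purpose="fine-tune",
)

# Start the fine-tune job against a base gpt-3 model (an illustrative choice).
job = openai.FineTune.create(
    training_file=training_file["id"],
    model="davinci",
)

print(job["id"])  # poll this job until it reports the fine-tuned model ID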



When it finishes, we receive a model ID, and we’re good to go!


Hosting with node.js and further priming to foster a conversational interaction


Given that I am working towards an immersive webapp, it helps that OpenAI provides support for node.js, allowing us to make requests in JavaScript. The official documentation is at the following link:
https://beta.openai.com/docs/quickstart


Now we have a completion model, meaning the model will complete whatever our users put into the request, but we need to complete one more step for it to generate a sensible conversation.




In the image above, I created a JSON file containing the final “priming prompt” to help situate my already-tuned model in a two-party conversational setting.


These texts are not visible to users, but they are still prepended to the prompt before each user’s input. And this is where the magic of working with gpt-3 comes in! Besides the “name”, I have separated the text into two categories: [ instruction, content ]


What goes on in the backend is basically a simple combination of these two groups of text, and gpt-3 will understand and even “sense” the tone of the conversation. The purpose of the “content” part is to give our model the idea of a conversation between two participants, so I assigned “A:” and “B:” as two different identities.
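
Sketched in Python below purely for illustration (the production app does this in node.js), the assembly is just string concatenation; the structure of the priming file (name / instruction / content) and the fine-tuned model ID are assumptions based on the description above.

# Illustrative sketch of the backend prompt assembly; the real app does this in node.js.
# The priming.json structure (name / instruction / content) is an assumption.
import json
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

with open("priming.json", "r", encoding="utf-8") as f:
    priming = json.load(f)

def build_prompt(user_input: str) -> str:
    # Hidden priming text first, then the example exchange between the two
    # identities "A:" and "B:", then the live user input as "A:".
    return (
        priming["instruction"] + "\n"
        + priming["content"] + "\n"
        + "A: " + user_input + "\n"
        + "B:"
    )

response = openai.Completion.create(
    model="davinci:ft-personal-XXXX",  # placeholder for the fine-tuned model ID
    prompt=build_prompt("I had a strange dream last night."),
    max_tokens=128,
    temperature=0.9,
    stop=["A:"],  # stop before the model starts speaking as the user again
)
print(response["choices"][0]["text"].strip())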




In my script, I simply replace the displayed labels with my desired names.
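
As a small hedged sketch (the display names below are placeholders, not the ones used in the installation), that swap can be as simple as:

# Replace the internal speaker labels with display names before rendering.
DISPLAY_NAMES = {"A:": "You:", "B:": "Computer:"}  # placeholder names

def to_display(line: str) -> str:
    for label, name in DISPLAY_NAMES.items():
        if line.startswith(label):
            return name + line[len(label):]
    return line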


Error handling


Given the uncertain nature of machine learning, early iterations of the prototype would occasionally receive a null response from the model if the user’s input was too short. For this case, I created a “filler” sentence: if the input length is shorter than six, the app automatically adds the response “Can you tell me more?”
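
A minimal sketch of that fallback, written in Python for illustration (the production chatbot does this in node.js), and treating the threshold of six as a character count, which is an assumption:

# Fall back to a filler sentence when the input is too short or the model returns nothing.
from typing import Optional

FILLER = "Can you tell me more?"

def safe_reply(user_input: str, model_reply: Optional[str]) -> str:
    if len(user_input.strip()) < 6 or not model_reply:
        return FILLER
    return model_reply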

Art Direction and world building

I’d like to start by acknowledging the most prominent image: a 90s office space with its vintage computer monitor. This stems from my personal attempt to create a physical mood board out of contemporary media’s portrayal of early techno-optimism.

By situating an AI chatbot in a vintage setting, I intended to harness the positive and nostalgic connotations that come with 90s pop culture artifacts to ease the audience into the experience.

While the mission of Atopia (my speculative institute) is quite layered and complicated, my intention wasn’t for audiences to grasp all of it, but for them to feel the security of a larger world out there that made this institute and the website they’re looking at. With that, the pop culture imagery again plays into the theme and the world-building.


The voice of a sad robot and where it’s from


As my thesis combines the existential human condition with machine learning, my goal is to translate this labor into artistic settings, not only as an intuitively expository gesture toward the audience but also as part of my personal exploration.



This stage started out with the diagram above, a high-level taxonomic breakdown of the elements involved within the realm of moods and zeitgeists. Heavily influenced by contemporary pop culture’s collective nostalgia for the 90s, I started from the bottom of the diagram and slowly worked my way up to my theme, gathering feedback from my faculty and fellow students on which vocabularies to include and which to leave out.
Updating... more to come.