Billy Ho


Billy Ho (he/him) is a Creative Technologist and Designer who works at the intersection of Art, Research, and Commercial Technology. His work aims to evoke Introspection and Empathy for both humans and machines.

Billy holds a BFA in Industrial Design from Shih Chien University and an MFA in Design and Technology from Parsons School of Design.









Making of: Are you there computer? It’s us, humans

 2022

From the very first conception to code and installation, this is the ongoing master documentation of the making of Are you there computer? It’s us, humans.

Conception



- Machine that “gets it”



I am creating an interactive experience to explore text-generating machine learning systems’ influence on modern human audiences.

Modern natural language processing technologies have largely been deployed as goal-oriented tools that channel users’ intent into corporate responses, at a scale meant to maximize efficiency. For example, many chatbots are being deployed to replace customer call centers, sentiment analysis models are used to aggregate product reviews, and the same technologies reach into our daily interactions with smart home assistants and email auto-complete.

Across all of these domains, one can identify a consistent reaction when the general public first saw these breakthroughs: an oscillation between wonder and unease at how closely the generated sentences resemble our own. It felt like something had been breached, somewhere between humans and the tools they had crafted and used.

The question I am trying to answer is what influence these “human-like” tools have on us and on the fundamental understanding of our identities. I do so by providing a physical space that invites audiences to interact with one such tool, as a collaboration between myself and them, and by amplifying and translating the generated content poetically into physical anamorphic letters emerging from the surfaces. My main interest and focus is the empathy implied in this machine-generated content and the human reactions to it.

I have chosen text as the content generated by my collaboration with machines, for the following reasons:

  1. Compared to images, moving pictures, and visuals in general, text has a direct lineage to language, which, as far as we know, is an information storage and communication system constructed by human beings. Text has thereby become inseparable from human civilization.
  2. The current capabilities of language-based machine learning provide a more controlled environment for the generated content than other mediums do.

For these reasons, I have decided to work with OpenAI’s natural language model GPT-3, for its large number of parameters learned from its training material, its open access, and the wide range of coding environments it supports. I have fine-tuned the model on a collection of texts describing people’s dreams, so that it generates text in the style of the fine-tuning material. In addition, I plan to render the generated text on a vintage 90s computer monitor within an immersive office setting, which I will elaborate on in a later section.

My thesis situates modern machine learning systems as intelligence technologies and examines the recursive influences between humans and those technologies, leveraging text as a shared property between the two parties. Throughout, I present a series of experiments on my interaction and collaboration with OpenAI’s GPT-3 and the different ways and mediums I have chosen to represent the process, followed by feedback assessments and subsequent iterations.

My interest in language-based machine learning came from the promise of introspection: through a technology that is ambiguously close to us yet external, the promise of a quantifiable understanding of empathy and human emotions. Because of the nature of my subject, I found throughout the testing sessions that my audiences were often intrigued from the start. Carried by the current cultural environment, the majority of the audience found my angle on the domain compelling.

My aim for the project is to provide an accessible framework that fosters discourse and reflection on the influence of ubiquitous intelligent systems outside of goal-oriented implementations. The project centers on OpenAI’s GPT-3 and highlights its potential in two directions: the flexibility of fine-tuning together with the appearance of an empathetic machine, and the treatment of the generated text in ways that imply an external origin.

Data-Processing | Python Prompt to Completion | gpt-3



The final installation centers around an interactive chatbot running on a Node.js-based web app, which accesses a set of fine-tuned machine learning APIs hosted by OpenAI.


A natural language processing (NLP) neural network is a specific subset of machine learning designed to work with human languages. At a very high level, its essence is to predict an outcome (in the form of text) from a given prompt (also in the form of text). Modern NLP models like OpenAI’s GPT-3 are already trained on massive amounts of real-world human text, and OpenAI has published them as public APIs: a publicly accessible tool set that people can use to further “prime” the model to generate their desired outcomes.


Fine-tune a model


As stated above, the model is already trained on a large amount of data. As end users, we can access the model’s last layer and feed in our own custom dataset to perform a “transformation” of that layer, meaning the generated content will likely come out in the style of our curated dataset. This technique is called fine-tuning.



Data collection and processing


The first step is to decide what to select as the fine-tuning text, then reform it into two groups: prompts and completions. To demonstrate, I will use part of my thesis paper as the fine-tuning material.



The material is stored in a .txt file called pa01.txt. To send a fine-tuning request to GPT-3, we need to process this file into prompts and completions. OpenAI’s official documentation requires a specific file format known as JSONL, with the content in the following form:


{"prompt": "<prompt text>", "completion": "<ideal generated text>"}

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}

...


To start, I wrote a Python script that processes the original text into a CSV file with two columns: [ Prompt, Completion ]
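A minimal sketch of such a script, using only the Python standard library. The pairing rule (each sentence becomes the prompt for the sentence that follows it) and the function names `split_sentences`, `make_pairs`, and `write_csv` are assumptions made for illustration, not the original script.

```python
import csv
import io


def split_sentences(text):
    """Split raw text on '.' and drop empty fragments."""
    return [part.strip() for part in text.split(".") if part.strip()]


def make_pairs(sentences):
    """Pair each sentence (prompt) with the one after it (completion).

    Assumption: this pairing rule is illustrative only.
    """
    return [
        (sentences[i] + ".", sentences[i + 1] + ".")
        for i in range(len(sentences) - 1)
    ]


def write_csv(pairs, handle):
    """Write the pairs to a file-like object under a two-column header."""
    writer = csv.writer(handle)
    writer.writerow(["prompt", "completion"])
    writer.writerows(pairs)


if __name__ == "__main__":
    # Placeholder text standing in for the contents of pa01.txt.
    sample = "First sentence. Second sentence. Third sentence."
    buffer = io.StringIO()
    write_csv(make_pairs(split_sentences(sample)), buffer)
    print(buffer.getvalue())
```

To process the real file, one would read pa01.txt with `open("pa01.txt", encoding="utf-8")` and pass `write_csv` a file opened for writing with `newline=""`.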





In this example, I split the material on the character “.”. Next, we export the file as CSV and use one of OpenAI’s tools in the terminal to transform the CSV file into JSONL.
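The conversion itself was done with OpenAI’s tooling. As a rough stand-in that shows what the step amounts to, the sketch below turns a two-column CSV into JSONL with Python’s standard library. It is not the tool’s implementation; the function name `csv_to_jsonl` and the sample row are hypothetical, and the CSV is assumed to have a header row of `prompt,completion`.

```python
import csv
import io
import json


def csv_to_jsonl(csv_handle, jsonl_handle):
    """Copy rows of a prompt/completion CSV into JSONL, one object per line.

    Returns the number of records written.
    """
    count = 0
    for row in csv.DictReader(csv_handle):
        record = {"prompt": row["prompt"], "completion": row["completion"]}
        # ensure_ascii=False keeps curly quotes and accents readable.
        jsonl_handle.write(json.dumps(record, ensure_ascii=False) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Placeholder CSV content; a real run would open the exported file.
    source = io.StringIO("prompt,completion\nFirst sentence.,Second sentence.\n")
    target = io.StringIO()
    written = csv_to_jsonl(source, target)
    print(written, "record(s) written")
    print(target.getvalue(), end="")
```

Writing one JSON object per line, with no enclosing array, is what distinguishes JSONL from an ordinary JSON file.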


Once we have successfully created our JSONL fine-tuning file, we once again use OpenAI’s API to execute the fine-tuning. In this example I am using the macOS terminal to make the request.



When finished, we will receive a modelID, and we’re good to go!


Hosting with node.js and further priming to foster a conversational interaction


Given that I am working towards an immersive web app, OpenAI has thankfully created support for Node.js, allowing us to make requests in JavaScript. The official documentation is at the following link:
https://beta.openai.com/docs/quickstart


Now we have a completion model, meaning that the model will complete whatever our users put in the request. We need one more step for it to generate sensible conversations.




In the image above, I created a JSON file containing the final “priming prompt” to help situate my already tuned model in a two-party conversational setting.


These texts are not visible to users, but they are still placed in the prompt before the users’ input. And this is where the magic of working with GPT-3 comes in! Besides the “name”, I have separated the text into two categories: [ instruction, content ]


What goes on in the backend is basically a simple concatenation of these two groups of text, and GPT-3 will understand and even “sense” the tone of the conversation. The purpose of the “content” part is to give our model the idea of a conversation between two participants, so I assigned “A:” and “B:” as two different identities.
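The combination step might look roughly like the sketch below. It is written in Python for consistency with the data-processing script, whereas the actual backend runs on Node.js. The JSON contents, the choice of “A” as the user and “B” as the model, the list form of “content”, and the function name `build_prompt` are all assumptions for illustration.

```python
import json

# Hypothetical priming file contents; the real priming text is not shown here.
PRIMING_JSON = """
{
  "name": "example-persona",
  "instruction": "The following is a conversation between A and B.",
  "content": ["A: Hello, are you there?", "B: Yes, I am here."]
}
"""


def build_prompt(priming, history, user_input):
    """Join the hidden priming text, earlier turns, and the new user turn.

    The prompt ends with "B:" so that a completion model continues
    the conversation as speaker B.
    """
    lines = [priming["instruction"], *priming["content"], *history]
    lines.append("A: " + user_input)
    lines.append("B:")
    return "\n".join(lines)


if __name__ == "__main__":
    priming = json.loads(PRIMING_JSON)
    print(build_prompt(priming, [], "I had a strange dream last night."))
```

Ending the prompt on the other speaker’s label is a common way to steer a completion model into answering in turn rather than continuing the user’s sentence.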




In my script, I simply replace these labels in the display with my desired names.
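A small sketch of that replacement, again in Python rather than the JavaScript used by the web app. The display names “Human” and “Computer” and the function name `relabel` are hypothetical; only labels at the start of a line are swapped, so an “A:” appearing mid-sentence is left alone.

```python
import re

# Hypothetical display names; the names actually used are not given.
DISPLAY_NAMES = {"A": "Human", "B": "Computer"}


def relabel(transcript, names=DISPLAY_NAMES):
    """Swap a leading "A:" or "B:" on each line for its display name."""

    def swap(match):
        return names[match.group(1)] + ":"

    return re.sub(r"^(A|B):", swap, transcript, flags=re.MULTILINE)


if __name__ == "__main__":
    print(relabel("A: Are you there?\nB: I am here."))
```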


Error handling


Given the uncertain nature of machine learning, early iterations of the prototype would occasionally receive a null response from the model when the user’s input was too short. To handle this case, I created a “filler” sentence: if the input length is shorter than six, the script automatically adds the sentence “Can you tell me more?”
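The filler rule can be sketched as follows, in Python for consistency with the earlier script. Two points are assumptions: that “length” is counted in characters, and that the filler is appended to the user’s input before it is sent to the model. The function name `pad_short_input` is hypothetical.

```python
FILLER = "Can you tell me more?"
MIN_LENGTH = 6  # threshold from the text; counting characters is an assumption


def pad_short_input(user_input, min_length=MIN_LENGTH, filler=FILLER):
    """Append the filler sentence when the trimmed input is too short."""
    text = user_input.strip()
    if len(text) < min_length:
        # strip() again so an empty input yields just the filler sentence.
        return (text + " " + filler).strip()
    return text


if __name__ == "__main__":
    print(pad_short_input("Hi"))
    print(pad_short_input("I keep dreaming about the sea."))
```

Padding the input rather than rejecting it keeps the conversation moving, at the cost of sending the model slightly different text than the user typed.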

Experimental prototype iterations, narrative setting, and a lot more coming soon!