Epistemic Evaluation on the Transformer

Scope

The knowledge a trained transformer can provide in response to a prompt depends on two things:

First, the prompt provided.

Second, the material the model was trained on (e.g. for GPT-4, a large corpus of internet text; a model might instead be trained only on recipes, or fine-tuned).
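This dependence on the training material can be illustrated with a toy next-word model — a simple bigram counter, far simpler than a transformer, but the epistemic point is the same: the model can only surface continuations that appear in its corpus. The recipe corpus here is invented for illustration.

```python
from collections import defaultdict

def train_bigram(corpus):
    """Count word -> next-word transitions in a tiny corpus."""
    counts = defaultdict(lambda: defaultdict(int))
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict(counts, word):
    """Return the most frequent continuation, or None if the word was never seen."""
    if word not in counts:
        return None
    return max(counts[word], key=counts[word].get)

# A "recipes-only" model knows nothing outside its corpus.
recipe_model = train_bigram("add the flour then add the sugar then bake the cake")
print(predict(recipe_model, "the"))     # some recipe word
print(predict(recipe_model, "python"))  # None: outside the training data
```

The same mechanism scaled up is why a model trained on the open internet and a model trained only on recipes provide very different knowledge.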

The people we are concerned with are daily users of ChatGPT, as well as the data engineers and scientists who train other transformer models to solve various tasks.

Epistemic Values (Goldman’s Objectives)

Power

Discuss the capability of a trained model to pass on information.

Give examples of the information ChatGPT provides and the breadth of what it can provide. Contrast this with models that could be trained on false information or other material.

Reliability

First, take a look at ChatGPT and discuss how often it gets information wrong. Then move to the underlying architecture and discuss how the model can or cannot work depending on the information passed to it.

The model depends on the reliability of the underlying data: data which is not reliable will lead to a model which is unreliable, and vice versa.

Discuss the possibility that reliable data is synthesized into unreliable results.

Speed

Discuss the two major components of speed: the time required to train a model (very slow) and the time required to extract information from it (very fast).

Over a long period of time the second begins to outweigh the first, but if a model is unreliable and needs to be retrained, the time cost increases massively.
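The asymmetry can be made concrete with a toy model — ordinary gradient descent on a one-parameter fit, not a transformer, and the operation counts are purely illustrative: training touches every example many times, while answering a single query is one cheap pass.

```python
# Toy illustration of training cost vs. inference cost.
# A one-parameter model y = w * x, fit by gradient descent on a made-up "corpus".
data = [(x, 3.0 * x) for x in range(1, 101)]  # true relationship: y = 3x

w = 0.0
train_ops = 0
for epoch in range(200):          # training: many passes over all the data
    grad = 0.0
    for x, y in data:
        grad += 2 * (w * x - y) * x
        train_ops += 1
    grad /= len(data)             # mean gradient keeps the step size stable
    w -= 1e-4 * grad

infer_ops = 1                     # inference: one multiply per query
answer = w * 10

print(train_ops, infer_ops)       # 20000 operations vs. 1
print(round(answer, 1))           # close to the true value 30.0
```

Amortized over many queries the training cost fades, but retraining from scratch pays the large term all over again — which is the point about unreliable models above.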

Discuss fine-tuning and its potential benefits and consequences for speed (e.g. if GPT-4 turned out to be badly flawed in some unknown way, every model fine-tuned on it would suddenly need to be retrained).

Fecundity

Discuss two components: creating a model on your own (the cost of training a model is high) and using an existing model (ChatGPT is accessible, but are other models easy to use?). Note the necessity of computers.

Efficiency

Discuss the cost of training a model like ChatGPT versus a simple model, and the consequences this has for power.

Then move on to fine-tuning and how it can take existing models and train them further at a lower cost.
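The cost saving of fine-tuning can be sketched in miniature. The layer names and parameter counts below are hypothetical, not real GPT figures; the point is only that fine-tuning freezes the pretrained body and updates a small fraction of the weights.

```python
# Hypothetical layer sizes for illustration -- not real GPT parameter counts.
layers = {"embedding": 500_000, "block_1": 250_000, "block_2": 250_000, "head": 1_000}
frozen = {"embedding", "block_1", "block_2"}  # fine-tuning: freeze the pretrained body

trainable = sum(n for name, n in layers.items() if name not in frozen)
total = sum(layers.values())

print(f"fine-tuning updates {trainable} of {total} parameters")
```

The cheapness cuts both ways: every fine-tuned model inherits the frozen body, so a flaw in the base model propagates to all of its descendants.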

Alternatives (and Comparisons)

Google Search

Google Search requires fairly exact knowledge of what one wants to obtain; transformers require an extensive corpus on the topic to train, but once trained can be used generally with much less prior knowledge.

Wikipedia

Wikipedia provides a great deal of crowd-sourced information that is generally accurate, but it is often uncondensed and, for certain topics, written at an expert level (therefore reliable but inaccessible, i.e. less fecund).

Other Models

Long Short-Term Memory (LSTM) -> generally similar in principle to the transformer, with a less complex architecture (more accessible to less skilled users) but much more time to train (therefore less efficient overall).

Diffusion models -> typically used to generate images, which can have epistemic consequences depending on the content created and the logic behind it. Different subject matter and overall use case than transformers.

Goodness of the Transformer

Potential risk of displacing people in many industries related to writing, summarizing, or creating information.

Large benefit of being easy for consumers to use.

Strengths and Weaknesses of Goldman's Objectives

Goldman's objectives are discrete, but what they measure largely is not. Factors that affect one objective can affect many others (or necessarily do so), and some factors may affect none yet still be important to consider.