Txt to Image Introduction

The primary purpose of this series of pages is not, from my end, to facilitate teaching, but to put the reader in a position to answer questions about the role of Artificial Intelligence, specifically in its relation to txt to image software, in an informed manner.

This is a landscape riddled with misinformation spread through poor journalistic practice, rumour, popular mythology and something akin to the telling of contemporary fairy tales. The proof of this is simple and can be knocked out in a paragraph or two, but before doing so let me get to the root of the problem, which is simply that the communal understanding of how digital images are made, stored and re-created is either non-existent or pretty fragmented.

A photograph is a physical object that can be held in the hands. It is the product of light interacting with the silver halides in the emulsion of a roll of film and of sheets of photographic paper, each processed with the relevant chemicals in its respective space. A digital image, by contrast, is first and foremost simply a dataset. You can’t hold it, you can’t see it and the ownership of it is a disputed matter. However, the minute you use that dataset to print or display an image that does not belong to you, you enter another space and the relevant conventions and protocols have to be adhered to.
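To make the "image as dataset" idea concrete, here is a minimal sketch in Python. The names `pixels` and `to_ppm` are my own and purely illustrative. It assumes the simplest possible case: a 2 x 2 image stored as red, green and blue numbers, written out in the plain binary PPM (P6) layout. Real formats such as JPEG or PNG compress the numbers, but the principle is the same.

```python
# A 2 x 2 "image" held purely as numbers: rows of (red, green, blue) values, 0-255.
# These values are illustrative, not taken from any real picture.
pixels = [
    [(255, 0, 0), (0, 255, 0)],      # top row: a red pixel, a green pixel
    [(0, 0, 255), (255, 255, 255)],  # bottom row: a blue pixel, a white pixel
]


def to_ppm(pixels):
    """Pack the numbers into the bytes of a binary PPM (P6) file."""
    height = len(pixels)
    width = len(pixels[0])
    # The header states the format, the dimensions and the maximum channel value.
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    # The body is just the channel values, one byte each, row by row.
    body = bytes(channel for row in pixels for pixel in row for channel in pixel)
    return header + body


ppm_bytes = to_ppm(pixels)
print(list(ppm_bytes[-12:]))  # the twelve numbers that "are" the image
```

Nothing in `ppm_bytes` is a picture until some software reads those numbers and lights up a screen or drives a printer accordingly.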

Neural networks are trained on datasets, not images. No image is used in the making of AI images. That may be hard to conceive of as being true but, as the dog says to the pig in ‘Babe’, “that’s just the way things are”. How datasets are obtained is another issue, but there appears to be no breach of any relevant conventions in this process. By way of example, I have work on a lot of online galleries, on social media etc. The minute I upload an image file to Facebook, Instagram etc., the data in the file is reconfigured to conform with the particular site’s web protocols. The original data is stored and the reconfigured data is used to generate the image seen on the site. I have no claim to the reconfigured data (the colour profile, resolution etc. will have been changed) and consequently no claim to the generated image and, as many of you may well know, Meta owns everything you post.
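As a rough illustration of what "reconfigured" can mean, the sketch below halves the resolution of a tiny greyscale image by averaging each 2 x 2 block of values. This is an assumption made for the sake of example: I don't know what processing any particular site actually applies, and `downsample_2x` is a hypothetical helper, not anyone's real pipeline.

```python
def downsample_2x(pixels):
    """Halve width and height by averaging each 2 x 2 block of greyscale values."""
    out = []
    for y in range(0, len(pixels) - 1, 2):
        row = []
        for x in range(0, len(pixels[0]) - 1, 2):
            block = [
                pixels[y][x], pixels[y][x + 1],
                pixels[y + 1][x], pixels[y + 1][x + 1],
            ]
            row.append(sum(block) // 4)  # integer average of the four values
        out.append(row)
    return out


# Illustrative 4 x 4 greyscale data (0 = black, 255 = white).
original = [
    [0, 50, 100, 150],
    [50, 100, 150, 200],
    [100, 150, 200, 250],
    [150, 200, 250, 255],
]

resized = downsample_2x(original)
print(resized)
```

The resized data holds four numbers where the original held sixteen, and the original cannot be rebuilt from it.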

There are four types of neural networks associated with txt to image generation:

  • Diffusion Networks
  • Generative Adversarial Networks
  • Convolutional Neural Networks
  • Diverse Diffusion Networks (this is a new model and won’t be discussed at the moment)

Each has its own distinct advantages, disadvantages and applications. Getting to know how any of these work will quickly dispel any myths that persist in this area. When these are understood you will know that it’s impossible to commit any kind of plagiarism or forgery in the making of txt to image works. However, since these images are generated without human agency there are issues around artist / designer copyright, meaning that under current law you can’t copyright an image that was generated by a neural network. Once the artist continues to refine the image and the subsequent dataset via additional txt prompts, outpainting, further work in 3rd party software etc., however, human agency is indeed at work.


If you get through that little reading, the worst is over. Be sure to drop back; it’s possibly going to get interesting. 😊

The links below point to web documents that were first posted on a number of teacher networks on Facebook and have since been converted to the new Adobe Express platform. They cover a range of preliminary explorations into AI and may reflect a different understanding to that presented here. They also contain examples of things that could be done with students, along with works students have previously created.