Abstract

The formulation of a Swedish Cultural Canon by a committee of experts has sparked debate among cultural actors and the public: while some perceive it as controversial, others value its role in categorising and conserving key historical milestones of Swedish culture. “Encoding Culture” speculates that compiling cultural data, in the form of images and text, for generative AI training may replicate the mechanisms of canon formation—both reflecting and reinforcing the underlying systems of cultural selection and valuation. AI models would embed the structural norms and biases that govern how the data is selected and organised. The process of creating image datasets for generative AI, often viewed as neutral or purely technical, encodes not only cultural narratives but also biases and power dynamics. This paper explores the potential of these models as dynamic, generative systems that create outputs based on patterns while also potentially embodying and transmitting cultural values. In doing so, the parameterisation of culture through dataset compilation emerges as a tangible phenomenon: as the algorithm seeks to capture shared patterns and structural rules among the samples, it parallels the concept of a “canon-as-model”. Presented at the 2024 Artistic Research symposium, this ongoing project foregrounds the interplay between programming practices and cultural encoding, where datasets act as both reflections and agents of cultural normativity within AI-driven systems, with implications for how culture itself may be shaped in the age of AI.

AI-generated image showing examples of canonical paintings in a variety of genres
Figure 1. Output image from prompt without fine-tuning

The Social Impact of AI

The 2024 Artistic Research symposium showcased diverse perspectives on artificial intelligence (AI), with some participants embracing the technology and others critically examining societal issues related to data collection and AI implementation. My current research, presented then as work-in-progress, focused on utilising widely popular generative models to critique contemporary cultural definition and categorisation processes. As such models are developed based on theories of human cognition, they could be seen as mirrors reflecting our societal values and biases given that they would show how we think we think. This brings me to the title of the research project “Encoding Culture”, which explores how creating datasets for image-generating AI parallels the political and aesthetic logic behind national cultural canons. The project can be interpreted as the act of coding culture, as cyphering as well as programming it.

Midjourney, for example, is a multimodal environment that combines a large language model (LLM) with image-generation capabilities. It has become extremely popular, with a recognisable aesthetic that is very present in the current visual art and design realms. It is a diffusion model trained on various datasets and machine-learning algorithms to generate high-quality images in various styles from text prompts.[1] For my research, however, I selected images from Swedish visual art history and used them to fine-tune a Stable Diffusion 3.5 Medium model, chosen because it is open source, unlike Midjourney or DALL-E, so that it would produce works inspired by Swedish art.

The project I am currently leading is hosted at the Stockholm University of the Arts, and is in its early stages. We have established a small computing cluster equipped with mid-range GPUs to train models and test various concepts. This initiative, known as the Critical AI Working Group (KAIA), aims to evolve into an interdisciplinary research hub fostering critical discourse on AI and the arts. Our focus includes exploring how AI reshapes audiovisual creation, the performing arts and the broader media landscape, while also addressing ethical, social and political implications such as algorithmic bias, copyright issues and the risks associated with automating creative processes. Currently, KAIA functions primarily as a reading and discussion group, sharing and discussing publications on AI. Coordinating consistent participation remains a challenge due to the diverse primary research commitments of our members; nevertheless, we are committed to gradually developing this side project into a significant contributor to the discourse on AI in the arts.

Background: Why Do We Need a Critical Approach to AI?

The need for a critical approach to AI arises from the ethical, social and cultural issues surrounding its use, particularly in machine learning and, more specifically, deep learning technologies applied to artistic creation. These are issues deeply embedded within the project presented in this paper.

It is important to note that while the term AI is commonly associated with generative visual art created by diffusion models like Midjourney, artificial intelligence encompasses a much broader field. AI has been employed in artistic creation since the 1970s, long before the current wave of generative technologies.[2] Therefore, equating AI art solely with visual generative art overlooks the extensive range of creative applications facilitated by AI across various domains.

AI’s role in the arts extends beyond visual mediums, encompassing numerous forms of creative work generated through machine learning and deep learning tools. Understanding and critically examining these technologies is essential as they continue to shape and redefine artistic expression in contemporary culture. One of the central concerns is algorithmic bias, as algorithms are not impartial, despite the common perception that they are objective or neutral. As they are created by humans, human biases, whether intentional or unintentional, are inevitably embedded in the data and decision-making processes that shape these systems.[3]

Yet bias can, in certain contexts, be useful. When searching for specific information or filtering data, for instance, bias helps refine results. However, in widely deployed AI models, bias becomes problematic because these systems can lead to false positives or one-sided results. People assume that computational processes are inherently objective, but this misconception ignores the reality that algorithms reflect the values, beliefs and cultural biases of their creators.[4] Even the “style” of programming—decisions about what data to prioritise and how to structure models—embeds specific cultural perspectives.

Algorithmic bias is particularly discussed in relation to racial and ethnic disparities, because the datasets used to train AI systems often reflect existing social hierarchies. This is evident in surveillance technologies as well as in artistic creation. Facial recognition models, for example, have historically struggled to accurately identify people with darker skin tones due to training datasets being predominantly composed of lighter-skinned individuals.[5] This bias has led to misidentifications and wrongful arrests.

A notable case in the US involved a Black man who was misidentified by a surveillance system. His lawyers successfully argued that the failure was due to biases in the dataset, highlighting how incomplete or skewed training data can have real-world consequences.[6] Such biases extend beyond law enforcement into the creative sector, where AI-generated art is trained on datasets that have disproportionately reflected Western artistic traditions, further marginalising non-dominant aesthetic forms and traditions. Another critical issue is the automation of the creative process. Generative models, such as those used in AI-generated art, have been widely adopted, but they pose fundamental questions about authorship and labour. I invite the reader to familiarise themselves with these issues to understand the relevance of the project presented here.

Local Context: The Impact of Automation on Creativity and Artistic Labour

While AI models can generate appealing images, they do not engage in creativity in the human sense. Instead, they generate work based on statistical patterns derived from pre-existing human-created artworks.[7] However, I have argued that even though they lack intentionality, errors or hallucinations could be interpreted as machinic creative expressions.[8]

Such AI models rely on datasets built from the labour of artists, designers and other creatives: without their original contributions, AI algorithms would have nothing to learn from. Many artists are aware that their works can be absorbed into training corpora without their consent, and some want to fight back. The current digital art market, for example, creates a paradox: artists share their works to gain visibility and commissions, but once these are online, they can easily be scraped and repurposed to train AI models—which, particularly in illustration, can potentially undercut careers.

To counteract this, digital tools like Glaze and Nightshade have been developed. These allow artists to subtly distort their images in ways that disrupt AI training. By altering pixel-level details, the software introduces changes imperceptible to the human eye that interfere with machine-learning models, reducing the effectiveness of AI-generated copies.[9] Essentially, they force AI to learn from corrupted data, diminishing the quality of the generated outputs or inducing hallucinations.
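As a purely illustrative sketch, the principle behind such cloaking can be shown in a few lines of Python. This is not Glaze’s or Nightshade’s actual algorithm (both are far more sophisticated and model-aware); it only demonstrates the underlying idea that bounded, visually negligible pixel changes still alter the data a model would learn from:

```python
import random

def perturb(pixels, strength=2, seed=0):
    """Add a small, bounded offset to each 0-255 pixel value.

    Toy illustration only: tools like Glaze and Nightshade compute
    targeted, model-aware perturbations, not uniform random ones.
    A human viewer would not notice changes of +/-2 grey levels,
    yet the pixel statistics a model trains on have shifted.
    """
    rng = random.Random(seed)
    return [max(0, min(255, p + rng.randint(-strength, strength)))
            for p in pixels]

image = [120, 121, 119, 200, 201, 199, 50, 51]  # a toy greyscale "image"
cloaked = perturb(image)
# Every pixel stays within 2 grey levels of the original.
```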

Global Context: Ethics, Extraction and AI in a Time of War

AI’s impact on creativity is substantial, but its broader risks—to society, labour and human life—are more urgent, especially when automated decision-making is deployed in military contexts. For this reason, I opt not to disengage from AI completely, but to promote an informed approach to it. While automation in the creative sector does indeed threaten jobs and artistic agency, the deeper ethical dilemma is the delegation of critical human decisions to AI. Although automated decision-making systems are increasingly used in areas such as healthcare, law enforcement and finance, their most notable (and dangerous) use has been in warfare, often without sufficient oversight or accountability.[10] In my view, the intersection of algorithmic abstraction and real violence must not be treated as tangential to the arts; if we are serious about AI ethics, we must be serious about the consequences of its usage, which raises urgent questions about responsibility and control in any context.

When an AI system makes a consequential decision—whether in surveillance, judicial sentencing or warfare—who is accountable? The opacity of AI decision-making processes, combined with the perception of computational neutrality, creates an accountability gap with profound societal consequences: not only because of the potential for social manipulation but because, as decision-making is delegated to machines, the subtleties of discerning targets, subtleties linked to moral subjectivity, disappear. As military applications of AI are increasingly deployed in real-world scenarios, the shift of decisions once made by humans to machines raises profound ethical and legal concerns.[11] There are many testimonies of children being targeted by intelligent drones in Gaza.[12] Artists and researchers must resist becoming complicit or staying silent in AI’s extractive economies—whether of labour, culture, planetary resources or life itself. AI requires vast computational resources, consumes large amounts of electricity and necessitates extensive cooling systems that in turn require significant water consumption. Already in 2019 it was observed that the carbon footprint of training a single deep-learning model can be equivalent to the emissions of five cars over their lifetimes.[13]

Yet one of the most pressing issues of what is called the “Fourth Wave” of AI is that of deep fakes, which have become widespread in recent years.[14] The rapid development of generative adversarial networks (GANs) has made it possible to fabricate highly convincing videos, leading to many cases in which people, including older audiences, are deceived by synthetic media, contributing to disinformation.[15] The problem has since evolved: AI-generated videos have reached a level of realism at which even experts struggle to differentiate them from authentic footage. The creators of these deep fakes often select subjects that appear plausible—such as wildlife rescues in the Arctic—making detection even more difficult.

The widespread dependence on AI also raises concerns about general skill degradation. People are increasingly outsourcing fundamental cognitive tasks—such as writing and critical thinking—to AI systems like ChatGPT. This shift risks diminishing essential human competencies,[16] while reinforcing opaque algorithmic decision-making structures that are difficult to scrutinise.[17] Furthermore, the concentration of AI development in the hands of a few powerful corporations exacerbates economic inequality and centralises technological power in ways that benefit a small elite while disempowering others.[18]

Strategy: Harnessing the Possibilities of Artistic Research within AI

As AI is not merely a technological tool but a socio-technical system shaping, and shaped by, human choices, biases and past and present power structures, a critical approach is essential. The arts allow objective as well as subjective, speculative, emotional, creative and disruptive approaches, and it is precisely this flexibility that makes them a productive context in which to tackle AI and its associated issues. In contrast, the scientific community often emphasises objectivity and precision, setting speculation and critique aside. Where art embraces the fluidity of interpretation, science, particularly AI research, operates within stricter, logic-driven frameworks, sometimes leaving social aspects aside. This is why “Encoding Culture” is framed as an artistic research project that delves into a particular ongoing social event: the Swedish government’s intended development of a cultural canon, consisting of a collection of works deemed representative of or essential to Swedish culture, including artistic and engineering samples.[19] The proposal, led by historian Lars Trägårdh and supported by political groups such as the Sweden Democrats, has already sparked considerable controversy: critics contend that establishing a canon risks imposing rigid cultural parameters that could foster an authoritarian approach to national identity, while supporters argue that it serves to recognise and celebrate Sweden’s unique heritage.[20] My objective with this project is to go beyond such polarisation and offer an alternative model—an actual AI model, a generative system rather than a fixed list of works.

At the time of the publication of this paper, the Swedish Cultural Canon had been made public. From the perspective of my project, which relies on a broad dataset, the canon’s “image and form” list would constitute a rather small sample, given it consists of only ten artworks. Yet it might be relevant to mention that a few of the works in my training dataset for the “Encoding Culture” project do coincide with items in the proposed canon.[21]

Through a Different Lens: The Swedish Cultural Canon as Dataset

It is worth noting that the idea of a canon is not new; it has long served as a means of imposing order on cultural expressions, setting boundaries on what is deemed acceptable or valuable. Once established, however, a canon becomes the central reference point towards or against which all subsequent cultural interpretations may gravitate. The Swedish approach echoes similar initiatives in other countries. Denmark’s cultural canon, for example, was published as a book—a static record that, while authoritative, lacks the flexibility to accommodate new discoveries, such as the posthumous recognition of an obscure painter’s work.[22] In contrast, the Netherlands opted for an interactive website, allowing for continual updates and a broader representation of cultural narratives, including those from its former colonies.[23] Latvia offers yet another model, with an interactive map that invites users to explore regional art samples, thereby engaging the public in a more immersive and exploratory manner. For Maija Burima, the canon has two meanings: first, as “a collection of high quality, significant, influential, authoritative works”; second, as a set of “regulations or methods (…)”. Belonging to the canon “is a guarantee of quality, and that guarantee of high aesthetic quality serves as a promise”.[24]

Central to the debate is the question of what exactly a cultural canon represents. Historically, the term “canon” dates back to classical Greek culture, when it signified a standard or rule—literally a “rod” used for measurement. Polykleitos’s treatise on ideal human proportions is a prime example: it set forth a standard that influenced how art was taught and understood for centuries. As George Kennedy has argued, the formation of a canon is an inherent human impulse—a way to distil a multiplicity of options into a coherent, preserved tradition.[25] Mike Fleming expands on this, noting that while the term originally denoted a measuring rod, it has evolved into an authoritative list that guides cultural judgement.[26] A canon, therefore, is not simply an assemblage of works; it is a structure built upon specific rules and values that determine what is worthy of preservation. One could say that it becomes the standard incarnate, the model to use as reference, as happened with Polykleitos’s sculpture, whose proportions are still taught in art schools as if it were the mould for drawing the human figure.[27] This is why establishing a canon in art means setting a framework within which cultural identity is both defined and contested—a model that has been reformulated and distorted throughout modernity, hence the controversy.

Meanwhile, a dataset is a structured collection of data—images, text or numbers—organised for analysis, training or reference by computational systems, especially in machine-learning contexts. The data is selected depending on the desired learning outcomes. In this sense, gathering such data is like making a collection of samples—of, say, artworks that represent a culture. Understood as such, my approach is not necessarily meant as a direct critique of the Swedish Cultural Canon initiative. Rather, it is an exploration of how such a canon might be reimagined—not as a static, prescriptive list, but as a dynamic model that reflects the evolving nature of culture itself. Yet, acknowledging all the issues that come with the attempt to parametrise culture, when a canon is understood as the implementation of a set of rules, it becomes a framework that dictates what is permissible and what is not, establishing boundaries within which culture is shaped: a defined region of expression that leaves little room for deviation. Considering AI models in this context raises a pertinent question: what exactly is an AI model?

Methodology: The Canon as Model as Method

An AI model is an algorithm, a mathematical system, trained on a dataset to recognise patterns and to make inferences and decisions based on those patterns, without continuous human intervention or explicit programming for a specific task. These models apply various algorithms to relevant inputs to achieve specific goals or outputs. For example, IBM describes AI models as programs that detect specific patterns by analysing datasets, drawing conclusions and acting upon them.[28] In many cases, these models are used not only to recognise patterns but also to predict future trends or process new data.

AI models find application in diverse fields—from image and video recognition to natural language processing (as seen in models like ChatGPT), anomaly detection, recommender systems, predictive modelling, forecasting and robotics.[29] Hewlett-Packard’s definition reinforces this notion, emphasising that models are trained to detect specific patterns, thereby learning the underlying rules that generate those patterns. Indeed, a model does not merely recognise a pattern superficially; it absorbs the rules embedded in the data it processes, interacting with human input and refining its understanding. In this way, it learns from the data’s inherent biases, mirroring the biases present within the culture from which the data is drawn.
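What “learning the underlying rules that generate patterns” means can be illustrated with a deliberately minimal example: an ordinary least-squares fit that recovers the hidden rule behind a handful of samples. This is a toy stand-in, not the mechanism of any particular deep-learning model, but the principle is the same: whatever regularities (and biases) the samples carry, the fitted parameters absorb them.

```python
def fit_line(points):
    """Recover the slope/intercept 'rule' behind (x, y) samples by
    ordinary least squares -- a minimal stand-in for how a model
    absorbs the regularities present in its training data."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept

# Samples generated by the hidden rule y = 2x + 1: the fit recovers it.
samples = [(0, 1), (1, 3), (2, 5), (3, 7)]
slope, intercept = fit_line(samples)  # -> (2.0, 1.0)
```

If the samples had been skewed, the recovered rule would be skewed in exactly the same way: the model has no access to the world, only to the data it was given.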

Shaping the Canon: From Polykleitos to Policy

My artistic method starts by drawing a parallel between the concept of the canon and generative AI (GenAI) models. Like the canon (as a model), an AI model learns and stores (as an algorithm) a set of rules extracted from data, which serve as parameters for generating new patterns when it is prompted to do so (a probabilistic negotiation between user and AI model). While not linguistically equivalent, a cultural canon and a GenAI model both provide a structure from which creative output can be derived. This is a crucial departure from traditional thinking, in which the canon was an immutable standard: in the proposed approach, a canon-as-model becomes a dynamic set of rules that both reflects and shapes cultural ideals and identity.[30] This may also be a novel way to approach GenAI models. Extending this discussion to generative AI, I refer to models that use neural networks and deep learning to identify patterns within existing data in order to generate new, novel content. Foundational models such as GPT-3, GPT-4 and Stable Diffusion operate on this principle: they create text or images based on input prompts by drawing on learned examples from vast datasets. ChatGPT, for instance, generates text from short prompts, while Stable Diffusion produces photorealistic images from textual descriptions. The underlying process is similar in both cases: the model creates novel patterns based on the rules it has learned from its training data.

GenAI models, especially visual ones, learn by recognising commonalities and differences in the data. In the case of diffusion models, the process involves the deliberate introduction of noise into an image (forward diffusion) followed by the gradual removal of this noise (reverse diffusion) over multiple iterations. This iterative process enables the model to discern complex patterns and structures, effectively learning how to generate realistic images from an initially chaotic input.[31] This capacity to build images out of noise is shared by GANs and diffusion models alike: both evolve by identifying consistent patterns while also learning to distinguish finer details, demonstrating how models can generate coherent content from what appears to be disorder. This iterative refinement is not merely a technical process; it reflects how structured rules and parameters guide the creation of both outputs and algorithms.
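The forward/reverse structure described above can be sketched in miniature. The toy below diffuses a single number rather than an image, and the reverse pass “cheats” by remembering the exact noise it added; a real diffusion model must instead learn to *predict* that noise, which is precisely what training teaches it.

```python
import math
import random

ALPHA = 0.9  # fraction of signal retained at each diffusion step

def forward_diffuse(x, steps, rng):
    """Forward diffusion: repeatedly mix the signal with Gaussian noise,
    recording each noise sample so the toy reverse pass can undo it."""
    noises = []
    for _ in range(steps):
        eps = rng.gauss(0.0, 1.0)
        noises.append(eps)
        x = math.sqrt(ALPHA) * x + math.sqrt(1 - ALPHA) * eps
    return x, noises

def reverse_diffuse(x, noises):
    """Reverse diffusion with an oracle: undo each step in reverse order.
    A trained model replaces this oracle with a learned noise predictor."""
    for eps in reversed(noises):
        x = (x - math.sqrt(1 - ALPHA) * eps) / math.sqrt(ALPHA)
    return x

rng = random.Random(42)
x0 = 3.0
xT, noises = forward_diffuse(x0, steps=10, rng=rng)   # almost pure noise
recovered = reverse_diffuse(xT, noises)               # ~= 3.0 again
```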

Training the Model: A Canon in the Making

To test whether a generative model could internalise a sense of “Swedishness” in art—which is, after all, the goal behind the Swedish Cultural Canon—I set out to fine-tune Stable Diffusion 3.5 on a carefully selected dataset of Swedish visual culture (fine-tuning being the task of further training a pre-trained model on specific data so that it learns to do exactly what one intends to use it for). Turning to the concept of multimodal datasets, the selection process here involves both image and text, as the system uses two models working in tandem. In other words, multimodality in this context refers to the use of multiple modes to communicate a message.[32] A multimodal project may include text and images, but also motion or sound; even websites are inherently multimodal, since they combine text, images and code. This approach thus encompasses more than language alone—it draws on various semiotic resources to create a richer, more holistic communication experience.[33]

The Stable Diffusion 3.5 Medium model I used is a text-to-image generative model that relies on three main components: CLIP, a Variational Autoencoder (VAE) and a U-Net-based denoising model (nowadays diffusion-transformer-based).[34] To put it in lay terms: Stable Diffusion takes the text (the prompt or training data) and “understands” what you want using the CLIP (language-processing) model, which turns sentences into numbers—a special kind of code that captures the meaning of the words for later use in image generation (the fine-tuning dataset defines what kind of image we are aiming for). The compressed space produced by the VAE, known as the “latent space”, represents the original detailed image like a compact sketch—a simplified version that is easier for the model to work with. But this space is largely a black box to us: we can see the numbers, but we do not fully understand how they capture the image’s features. In turn, the U-Net or transformer (the actual diffusion model) gradually turns noise into an image, learning to iteratively remove noise from latent representations until it produces a recognisable image that matches the text prompt. The CLIP text embeddings condition the U-Net/diffusion model throughout this process. This architecture allows Stable Diffusion to generate high-quality images efficiently while maintaining semantic control via text prompts.
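The data flow just described can be summarised schematically. Every function below is a toy stand-in for a large neural network (the CLIP encoder, the diffusion denoiser, the VAE decoder); only the shape of the pipeline—text to embedding, noise to iteratively denoised latent, latent to decoded image—is meant to be accurate.

```python
def clip_encode(prompt):
    """Stand-in for the CLIP text encoder: text -> numbers (embedding).
    Real CLIP produces a learned semantic vector, not character codes."""
    return [float(ord(c) % 7) for c in prompt][:8]

def denoise(latent, text_embedding, steps=4):
    """Stand-in for the U-Net/diffusion transformer: iteratively nudge a
    noisy latent toward something consistent with the text embedding."""
    for _ in range(steps):
        latent = [0.5 * l + 0.5 * e for l, e in zip(latent, text_embedding)]
    return latent

def vae_decode(latent):
    """Stand-in for the VAE decoder: latent 'sketch' -> pixel values."""
    return [int(abs(v) * 10) % 256 for v in latent]

noise = [1.0] * 8                       # start from pure noise
embedding = clip_encode("Swedish art")  # prompt -> embedding
image = vae_decode(denoise(noise, embedding))
```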

Fine-tuning of Swedish Visual Heritage

For “Encoding Culture”, multimodal AI is used, and the models therefore need to be trained concurrently.[35] The images had to be edited to specific sizes, while the texts, consisting of detailed descriptions of the images they were paired with, had to be delivered as .txt files; the system recognises a pair when the image and text files share the same filename in the dataset folder used for training. As can be seen in the sample text + image pair below, the text should be as detailed and descriptive as possible.

Seventeenth-century painting of a Swedish royal man with long hair and a shiny blue sash across his upper body alongside a descriptive text
Figure 2. Training data for the dataset: a text and image pair used (with text written by a student)
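The filename-pairing convention described above can be sketched as follows. The filenames here are hypothetical, and the real training tooling performs this matching internally; this sketch only shows the rule that each image pairs with the .txt file sharing its stem.

```python
import tempfile
from pathlib import Path

def collect_pairs(folder):
    """Match each image with its caption by shared filename stem:
    e.g. img_001.png pairs with img_001.txt. Items missing either
    half of the pair are excluded from the training set."""
    folder = Path(folder)
    images = {p.stem: p for p in folder.glob("*.png")}
    captions = {p.stem: p for p in folder.glob("*.txt")}
    return {stem: (images[stem], captions[stem])
            for stem in images.keys() & captions.keys()}

# Tiny demonstration with a throwaway folder and hypothetical names.
demo = Path(tempfile.mkdtemp())
(demo / "royal_portrait_01.png").write_bytes(b"")
(demo / "royal_portrait_01.txt").write_text("Seventeenth-century painting ...")
(demo / "unlabelled.png").write_bytes(b"")   # no caption: will be skipped
pairs = collect_pairs(demo)                  # only royal_portrait_01 survives
```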

The training process can be defined as follows:

  1. Preparing the Dataset: Manually scraping pictures of representative Swedish artworks from different eras (before contemporary art), then cleaning the image-text pairs (e.g. adding captions or prompts) so the system would accept them, as SD 3.5 (Stable Diffusion 3.5) requires pictures in a specific format.
  2. Configuring the Training Environment: A locally run NVIDIA RTX 4090 GPU was used. The hyperparameters (learning rate, batch size, steps, etc.) had to be set manually and carefully according to the computer’s specifications.
  3. Preprocessing the Data: Resizing/cropping images, tokenising text with the CLIP tokeniser and encoding images into latent space using the VAE.
  4. Training Method: Fine-tuning.
  5. Training the U-Net (or Transformer for SD 3+): The system uses the latent representations and text embeddings (the descriptive text files) to learn the denoising process (how patterns appear once noise is removed).
  6. Periodically Saving Checkpoints: So one can resume training or evaluate intermediate results. In my process this was important: my dataset was small, so there was always a risk of overfitting—when a model learns the training data too perfectly, including its flaws, and performs poorly when generating new, previously unseen data.
  7. Testing (Running Inference/Sampling): Using the trained model to generate images from new text prompts.
  8. Evaluating & Refining: Testing image quality and alignment with prompts, then adjusting training as needed or choosing the most suitable checkpoint.
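Steps 4 to 6 above can be caricatured with a single-parameter “model”: gradient descent with periodic checkpoints. This is of course nothing like the scale of actual diffusion fine-tuning, which optimises millions of weights, but the loop structure (compute gradient, take a step, snapshot the state) is the same in spirit, and the checkpoints are what make it possible to fall back to an earlier, less overfitted state.

```python
def train(data, epochs=200, lr=0.1, checkpoint_every=50):
    """Toy training loop: fit a single weight w so that w*x ~= y,
    saving periodic checkpoints to resume or evaluate from later."""
    w = 0.0                       # the single 'model weight'
    checkpoints = {}
    for epoch in range(1, epochs + 1):
        # Gradient of the mean squared error over the dataset.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad            # one optimisation step
        if epoch % checkpoint_every == 0:
            checkpoints[epoch] = w  # snapshot of the model state
    return w, checkpoints

data = [(1, 2.0), (2, 4.0), (3, 6.0)]   # generated by the rule y = 2x
w, ckpts = train(data)                  # w converges towards 2.0
```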

Validation and Output Analysis

After the previous process was done, I generated images using a specific environment known as Comfy UI (UI as in User Interface), a browser-based tool that makes it efficient to prompt for and output the desired Swedish art images in near real time.[36] With its visual layout of boxes and adjustable parameters, Comfy UI simplifies complex programming tasks—making it particularly intuitive, friendly and attractive to artists who might otherwise be challenged by traditional coding. Its node-based system is designed to streamline the creative process, allowing focus on the visual and conceptual aspects of the work.

One of my initial experiments with Stable Diffusion 3.5 medium was straightforward: I began with the basic prompt “Swedish art” to see what the non-fine-tuned SD 3.5 was “understanding” of this prompt. The output, while generic, carried a subtle Swedish signature. Although not entirely distinctive, certain visual elements suggested that the work could be interpreted as Swedish art.

Node-based user interface of Comfy UI showing an image generated by Stable Diffusion that can be interpreted as Swedish Art
Figure 3. “Swedish Art” generated by the non-fine-tuned SD model

I then refined the prompt to “Contemporary art from Sweden”. This modification resulted in a more nuanced image which, despite the model not having been directly trained on Swedish art, intriguingly incorporated the colours of the Swedish flag. This association emerged organically within the model, underscoring how cultural symbols can be embedded even in the absence of explicit instruction. Even though this output may still appear generic, the inclusion of the Swedish flag’s colours signals a cultural connection that hints at a broader artistic tradition. When I further adjusted the prompt, the resulting abstract piece retained the Swedish colour palette without directly depicting the flag. If the piece were presented in an art exhibition, one might comment on its apparent foundation in art-historical study—perhaps reminiscent of works from the 1950s—owing to the evocative use of colour. Conversely, in the absence of these colour cues, the connection to Swedish art might not be immediately apparent.

This phenomenon highlights the crucial role of culture, memory and associative cues in the interpretation of art. The Swedish flag’s colours serve as a visual shorthand, a cue that prompts viewers to easily (and rather uninterestingly) associate the work with Swedish identity, even if no explicit artistic tradition is being followed. In this way, the AI’s output underscores how cultural context is vital to our interpretation of visual art.[37]

Node-based user interface of Comfy UI showing an image generated by Stable Diffusion that can be interpreted as Modern Swedish art
Figure 4. “Contemporary Swedish Art” generated by the non-fine-tuned SD model

Discussion: Hybridity, Bias and the Possibility of Synthetic “Swedish” art

Now, shifting to a practical example: as stated before, the images shown were generated using the Stable Diffusion 3.5 medium-sized model, a state-of-the-art generative AI that interprets textual prompts to produce images. The model has been pre-trained on a large image-text pair dataset, as shown before. For the image generated at the start of this paper, I provided the prompt “Swedish Cultural Canon” to see what associations the large model would produce. As Stable Diffusion 3.5 is open source, it can be downloaded, modified and retrained on personal hardware, allowing for experimentation with customised datasets.

The prompting strategy I used was intentionally simple: I wanted to analyse what the model already “understood” about the concepts it was prompted with. The term canon has strong historical connotations, particularly in literature, dating back to the 1500s, and this could be seen in the output. The term originally re-emerged in literary contexts to refer to a cultural corpus of texts and was later adopted in the visual arts, where it referred to established artistic ideals.[38] This historical weight clearly influences how AI models interpret the term: it was present in the original training dataset.

When generating images related to the Swedish Cultural Canon, I was curious to see if the model would produce something distinctly Swedish. However, identifying uniquely Swedish elements in an AI-generated image is complex. The output bore similarities to certain works of art from before the twentieth century, but distinguishing national identity in historical art is challenging. Many post-medieval Northern European cities shared architectural and artistic styles, making it difficult to pinpoint purely Swedish visual traits. This raises a fundamental question: what does the model recognise as Swedishness?

When Is Art National and When Is It Global?

This question is deeply tied to the formation of nation-states, as national identities within such governance systems often incorporate historical elements relating to events that predate the state’s official formation but happened within its current territory. Cultural artefacts that existed in the geographical region now known as Sweden may, for instance, be retrospectively claimed as part of national history, even if they predate Sweden as a unified political entity.[39] This is a common phenomenon in national historiography.

My work ultimately explores the encoding of culture, and not only through AI; while it considers how artificial intelligence processes and represents cultural concepts, and the underlying assumptions embedded in these systems, it also asks how culture itself is parametrised. When I speak of the parametrisation of culture, I refer to the process of encoding, understood as simplifying, cultural expressions, styles or values into discrete, computable elements to be managed digitally, such as data points or rules, thereby translating complex, fluid cultural phenomena into latent algorithmic forms (and parameters).
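As a toy illustration of what such encoding means in practice (the fields and values below are entirely hypothetical, chosen only to show the reduction), a cultural sample might be flattened into a handful of discrete parameters:

```python
from dataclasses import dataclass, asdict


@dataclass
class CulturalSample:
    """A deliberately crude parametrisation of an artwork (hypothetical fields)."""
    period: str         # e.g. "1890s"
    palette: list[str]  # dominant colours reduced to labels
    motif: str          # subject matter reduced to a single tag


sample = CulturalSample(period="1890s",
                        palette=["red", "green", "white"],
                        motif="domestic interior")

# Once encoded, the sample is just data: it can be counted, filtered and
# averaged over, and everything the parameters omit is lost to the model.
record = asdict(sample)
```

The reduction is the point: whatever the chosen parameters fail to capture simply does not exist for the algorithm downstream.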

As the very notion of a cultural canon is controversial, involving the definition of what is deemed essential and valuable in a given culture, using GenAI to engage with this concept could reveal both the implicit biases in machine learning models and the broader implications of cultural selection processes in the digital age. The project produced two unintended outputs: my own learning about Swedish art in a historical framework, and the realisation that GenAI models could be seen as a kind of dynamic meta-archive, storing parameters of possibilities rather than actual digital documents.

Group of images that form the ancient and medieval portion of the project dataset
Figure 5. Sample of dataset showing ancient and medieval painting from the Swedish region

Learning about Swedish History by Putting Together a Selection of Images

The dataset contains some paintings that predate the formation of Sweden as a nation, including painted runes. Although these works predate a Swedish state or kingdom, I considered them representative of Sweden. I want to stress that this decision was based on a subjective consideration on my part, informed by what is perceived as “Swedish history”. The dataset also includes medieval art, although there are not many examples from that period, with the majority dating from the 1400s. Interestingly, there is a notable gap in Swedish art from the 1700s.

This absence is partly due to a fire, but also because Sweden was at the time opening up to Europe and apparently began hiring French painters, especially for portraying royalty.[40] Before that period, Swedish art had been heavily influenced by German artists, especially during the 1500s, when the primary painter of Swedish religious art was German.[41] As a result, the style was predominantly Germanic, making it difficult to distinguish a specifically Swedish style in those times. It is especially hard to pinpoint a Swedish style in the 1700s, since many of the works were heavily influenced by, and stylistically very close to, French art. As for earlier periods, another part of the dataset includes antiquities from the prehistoric and Viking eras, as well as paintings from the 1400s.

AI-generated images in a hybrid style in between rune-like and medieval paintings
Figure 6. Images in a hybrid style in between rune-like and medieval paintings, generated after fine-tuning the model

Observing the dataset, it appeared that Sweden had developed a distinctly local style in the 1800s, especially towards the end of that century. Artists with a strong inclination towards illustration played a significant role in defining this aesthetic, with Carl Larsson standing out as a particularly influential figure. His style was characterised by a clear and unmistakable visual language that fitted more within an illustration tradition. His works offer a dual insight: both the subject matter and the execution can be understood as inherently Swedish. This duality is not shared by all the works considered Swedish in the country’s visual art history. The architecture and decorative elements of houses from that period, for example, bear unmistakable national traits, and even though clothing styles across Europe exhibited commonalities, specific details, such as the prevalence of blonde hair and blue eyes in the people portrayed, could be seen as visual markers of the Swedish (and, more broadly, Northern European) context of the time. The works in question do not simply represent a visual style; they could be argued to embody an evolving cultural identity. Nor can the illustration style be seen as sprouting from an exclusively Swedish tradition: although its recognisable characteristics make Larsson’s work easy to identify, illustrations of the period share stylistic qualities, such as strong lines.

The progression in style, as captured by the AI model during training, reveals a learning process that moves from a generic output to one that aligns with something that, for certain periods, can be defined as reflecting a national identity. With scarce yet sufficient data, the model has managed to replicate Larsson’s stylistic nuances—an achievement facilitated by the distinctiveness of his illustrations, which differ notably from the more emotive expressionist techniques found in other movements. This is significant because it illustrates how the AI model can discern and emulate stylistic features that are tied to historical and cultural narratives.[42]

AI-generated images of an expressionist-style painted landscape in light blue, green and yellow tones
Figure 7. The generic prompt generated a more modern (expressionist) style, which raises the question of whether this can be defined as characteristically Swedish, as it maintains the yellow and blue of the Swedish flag seen in the initial outputs.

Fine-tuning and Sample Outputs

Putting together the dataset was a multifaceted task, carried out in two stages: first, I included works up to the 1700s; for a second fine-tuning, I incorporated a wider range of samples, from works produced up to the 1800s to more modern art. The set even draws on contributions from an AI class in which students provided descriptive texts for images. These diverse inputs introduce a measure of subjectivity into the dataset. While my perspective remains somewhat detached, given my background and the absence of formal training in Swedish art history beyond the contemporary, the contributions of local scholars and practitioners could offer alternative interpretations that further enrich the dataset.
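One common way to organise such a staged dataset of images and descriptive texts is the Hugging Face `imagefolder` convention: a directory of images plus a `metadata.jsonl` file pairing each image with its caption. The sketch below follows that assumption; the file names and captions are invented placeholders, not items from the actual project dataset.

```python
import json
from pathlib import Path


def write_metadata(entries, out_dir):
    """Write a metadata.jsonl pairing each image file with its caption."""
    path = Path(out_dir) / "metadata.jsonl"
    with open(path, "w", encoding="utf-8") as f:
        for file_name, caption in entries:
            f.write(json.dumps({"file_name": file_name, "text": caption},
                               ensure_ascii=False) + "\n")
    return path


# Stage one: works up to the 1700s (placeholder examples).
stage_one = [
    ("runestone_01.jpg", "painted runestone, pre-Christian era, Swedish region"),
    ("church_mural_01.jpg", "medieval church painting, 1400s, Swedish region"),
]
# A second fine-tuning stage would extend the same file with later periods.
```

Because the captions live in a plain text file, every subjective choice, from what counts as “Swedish region” to how a motif is named, is literally written into the training data.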

The training process involves generating validation sets that allow its progress to be monitored at different steps.[43] At step 595, one could already recognise that the model had learned the characteristics of Hilma af Klint’s painting, providing a contrast and complement to Larsson’s style. A direct comparison between Hilma af Klint and Carl Larsson reveals overlapping characteristics and distinct differences, even when the same prompt is used. This phenomenon is captured by the concept of “style transfer” in generative AI, a technique through which an image is transformed into the style of a particular artist, akin to asking the algorithm to render a work in the manner of Vincent van Gogh.[44] It is possible to recombine content with this method too: the algorithm learns to replicate specific visual cues from each artist’s images, applying them to new images in a way that blurs the boundaries between imitation and originality.

Over time, “style” in this AI framework has come to function as little more than a visual filter (or a content reconstruction). The process of transforming an image is now simplified; it focuses less on replicating the nuanced brushstrokes or the rich personal histories that once defined an artist’s work and more on applying a predetermined set of parametrised visual characteristics that imitate how the works superficially looked.
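This notion of style as parametrised statistics can be made concrete. In the classic neural style-transfer formulation, the “style” of an image is summarised by Gram matrices: correlations between the feature channels a neural network extracts. The sketch below computes that statistic with NumPy; in practice the feature arrays would come from a pretrained network, whereas here they are arbitrary inputs.

```python
import numpy as np


def gram_matrix(features: np.ndarray) -> np.ndarray:
    """Channel-by-channel correlations: the 'style' statistic of neural style transfer.
    `features` has shape (channels, height, width)."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (h * w)


def style_distance(f_a: np.ndarray, f_b: np.ndarray) -> float:
    """Mean squared difference between two Gram matrices."""
    return float(np.mean((gram_matrix(f_a) - gram_matrix(f_b)) ** 2))
```

Everything an artist’s hand contributes is collapsed into these matrices: two images count as sharing a “style” when their channel correlations match, regardless of how the marks were made.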

Comparison between two images: one the original, the other a generated image after the original and the fine-tuned model
Figure 8. Validation image next to generated test image

While some argue that the term style has lost its relevance and that the digital age has led to an oversimplification of the study of the arts,[45] style has generally been seen as a vital connector between historical continuity and the creative signature of individual artists in modern art.[46] The point I want to make here is that once these characteristics are defined (the style of an artist), they can be encoded (learned by the algorithm), enabling the generation of new art that conforms to an established aesthetic, yet also prompting us to consider whether reducing style to a mere filter diminishes the artist’s unique perspective. The personal, tactile connection associated with an artist’s individual brushstrokes, such as those of Van Gogh, can be imitated, albeit in a flatter way, even when style is reduced to an algorithmic filter.[47] Examining the previous AI-generated image, one can identify elements reminiscent of both Larsson and Hilma af Klint, such as the careful use of colour and the circular composition. These blended influences highlight how AI can simultaneously capture and merge multiple stylistic legacies, resulting in works that are both familiar and alien. The patterns and constructions within such images provide a window into how the GenAI model interprets and processes artistic traditions, which, while distinct from human creativity, does reflect the underlying cultural narratives embedded in the data and the process itself.[48]

Ultimately, the exercise of training AI to recognise and reproduce Swedish art raises profound questions about cultural identity. While the model is capable of learning and generating art that adheres to established stylistic norms, it also exposes the fluidity and hybridity of cultural expressions. As with any cultural canon, the Swedish artistic tradition is not static but rather an evolving tapestry of influences, biases and historical contingencies. This project, therefore, serves not only as a technical exploration of machine learning but also as a philosophical enquiry into the nature of cultural identity and the limits of artistic reproduction in the digital age.

Two images of a man: one the original, the other a generated image after the original and the fine-tuned model
Figure 9. Validation image (left) next to generated test image reminiscent of Carl Larsson (yet not identical)

Conclusion: A (Re)Generative Swedish Cultural Canon?

This paper addresses the implications of a possible cultural parametrisation through the implementation of a national cultural canon by exploring the creation of a speculative dataset intended for training GenAI models. By fine-tuning a model on Swedish visual history and prompting it with terms like “Swedish painting”, outputs began to merge elements of, say, rune art with medieval iconography—suggesting an emergent and hybrid sense of Swedish art, albeit an artificial one.

The research project suggests that the AI model can function much as a cultural canon does: like a canon, the model internalises a set of parameters learned from a dataset of cultural samples and generates its subsequent outputs from them. In this way, the model becomes a generative engine for reproducing what we perceive as an interpreted tradition, while exposing the fragility and possible design of that tradition. The very act of selecting images for such training is not neutral: it is shaped by subjective decision-making, led by personal taste, historical knowledge or the biases of societal visibility and recognition. This last issue becomes clear when the characteristics of works by more famous artists tend to be over-represented in the outputs, which highlights bias in the model’s learning process, a phenomenon that can nevertheless be useful for identifying which artists, styles and/or content are treated as representative of a culture. The result is a model that does not just generate images; it amplifies patterns of cultural inclusion and exclusion already embedded in its dataset. This recursion therefore mirrors the mechanisms of canon formation in the historical sense.

An artist who adheres to the Swedish canon would recognise these implicit parameters and work with them, possibly becoming a more traditional artist. Similarly, the AI model learns to generate art that reflects these established parameters (patterns/relations/rules). Yet the process also raises the question of how we define what is “Swedish” in art when historical influences are so intermingled. The complexity of this task becomes even more apparent when considering the impact of external influences throughout recorded Swedish art history. In the eighteenth century, Swedish art absorbed significant stylistic characteristics from other countries, especially France. French influences are substantial, yet they create a problem: defining what is “Swedish enough” inevitably excludes works that might otherwise contribute to a richer understanding of Swedish culture as a whole. While the runic style seen in Swedish art is present in other European traditions—such as Celtic art—the continuous exchange of ideas makes it difficult to isolate a localised, untouched “pure” aesthetic.

Images of seventeenth-century paintings that look French but are generated by the model under the “Swedish” rubric because paintings of the period do not seem to have a specific Swedish style
Figure 10. How “Swedish” are the samples generated from the prompt “1700s Swedish Painting”?

What does it mean to define “Swedish” art in a context in which stylistic and national boundaries are porous? When Swedish visual culture was already heavily shaped by French court aesthetics in the eighteenth century? Or when, in earlier centuries, German artists produced much of the country’s religious art while on Swedish soil? Like most cultures, Swedish art has been subject to changes by exposure to other traditions. So, if we were to come up with a model in the era of AI, we should start by understanding that culture is dynamic rather than static, that it evolves over time and that it rarely fits neatly into predetermined categories, while becoming richer and more complex through intercultural exchange. The “Encoding Culture” project may signal that cultural identity is not fixed, and although it could be relevant to take a “snapshot” of certain iconic artworks for easy historical reference and shared access, these should be understood as such—as samples, not as the reflection of a monolithic, uniform cultural lineage of expression that remains static over time.

By assembling the project dataset, I may be both harnessing and challenging traditional notions of what constitutes the culture of a nation-state. Does this depend on the artist’s origin, that is, on where the artist was born or where they worked for most of their life? Or on the style used or the themes addressed? If we go back to the situation in the 1500s, for example, when one of the most prominent artists in Sweden was German, a difficult question is raised: should the work of a foreign-born painter commissioned in Sweden be part of the Swedish canon? Does national identity reside in origin, style, audience or context? Far from hypothetical, this question becomes operational when encoded into training data, as to define is also to delimit. I argue that once something is categorised, tagged, named, it can be encoded, modelled and repeated. But it can also be critiqued, reprogrammed and transformed. “Encoding Culture” suggests that any dataset-based canon must be understood as temporary and partial, albeit potentially dynamic and generative, rather than a stable and fixed foundation. It should be taken as something closer to a moveable frame than a rigid mould. This tension between assumed cultural fixity and computational flexibility is where the political stakes of generative AI could reside, once “released” into the wild, or the real world.

As we progress into contemporary art, the challenge of defining Swedish art will only become more pronounced; it could, for example, become more closely linked to innovation. In fact, this project addresses two of the areas Trägårdh has expressed interest in, artistic expression and engineering, at a moment when AI is increasingly applied both to and from these fields. This makes the project all the more relevant, situated as it is between the arts and the hard sciences. Yet within the legacy of conceptual art we already accept that the definition is the work, so in training a diffusion model one must ask: are we capturing the higher meaning behind the cultural expressions, now seen as a whole, or merely gathering general characteristics in the form of learned and stored repetitions of patterns? This becomes key when working with traditions as conceptually charged as canons, and the question is meant to remain open, ready to be discussed at a societal level.

What emerges from “Encoding Culture” is not a fixed portrait of Swedish culture but, perhaps, a reflection of its intersections: between illustration and nationalism, tradition and machine, engineering and aesthetics. As Lars Trägårdh’s canon committee engages in its own act of definition, this project offers an alternative approach: generative, dynamic and unfinished. The images produced by the GenAI contain traits that many would recognise as Swedish, but they also display characteristics that are more universal.

GenAI systems, like canons, might be shaped by the cultures they inherit through the choices we make in putting together the datasets we feed them. If culture is always in flux, then perhaps its most faithful representations are those that remain momentary. The process of encoding culture through multimodal dataset curation thus becomes both a technical exercise and a philosophical enquiry: not only into what culture is, and what is or is not authentic or representative within it, but also into who gets to define it, and for whom. Again, as culture is inherently fluid, the AI reflects this by generating images that, while identifiable by certain characteristics, are never rigidly fixed; thus, the method of training a GenAI model as canon implies that the images in the dataset can always be added to, changed or deleted.

This experiment, then, is a demonstration of how cultural traits evolve and how, even in a digitally driven process, it is possible to identify some markers of cultural identity—even if they are not absolute and largely depend on the viewer recognising familiar traits or not. As artists working with AI, we are not outside the systems we critique—we are both shaping and being shaped by them. In that sense, the task ahead is to contribute to models that reflect complexity without collapsing it into uniformity and exclusion. 

The project does not seek to define what Swedish art is or should be; rather, it stages a provocation: can cultural identity ever be defined through computational means without collapsing its inherent complexity? Should we even attempt to define a fixed cultural identity? While the model’s outputs may eventually feel “Swedish” to viewers, that perception is contingent upon colour schemes, motifs and the viewer’s prior associations with, or knowledge of, the subject. In contemporary art, where hybridity and criticality shape the field, treating such elements as fixed parameters can feel destabilising. It is precisely this instability that should be embraced, not rejected.

Footnotes

  1. Midjourney. “Midjourney,” n.d. https://www.midjourney.com/home/ (Accessed 2025-09-22).
  2. Boden, Margaret A. “Creativity and Artificial Intelligence.” Artificial Intelligence 103, no. 1–2 (August 1, 1998): 347–56. https://doi.org/10.1016/s0004-3702(98)00055-1
  3. Noble, Safiya U. Algorithms of Oppression: How Search Engines Reinforce Racism. New York: NYU Press. 2018.
  4. Broussard, Meredith. Artificial Unintelligence: How Computers Misunderstand the World. Cambridge, MA: MIT Press. 2018.
  5. Buolamwini, Joy and Timnit Gebru. “Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification”. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency, 77-91. 2018.
  6. Garvie, Claire, A. Bedoya, and J. Frankle. The Perpetual Line-Up: Unregulated Police Face Recognition in America. Georgetown Law Center on Privacy & Technology, 2016. https://www.perpetuallineup.org/ (Accessed 2025-07-30).
  7. Puri, Aina. “Original or Stolen? The Battle Between AI Image Generators and Visual Artists.” Cambridge University Law Review. https://www.culawreview.org/journal/original-or-stolen-the-battle-between-ai-image-generators-and-visual-artists (Accessed 2025-09-22).
  8. Torres Núñez del Prado, Paola. “AIELSON: A Neural Spoken-Word Poetry Generator with a Distinct South American Voice”. Journal of Interdisciplinary Voice Studies 7, no. 1 (2022): 11.
  9. Shan, Shawn, Jenna Cryan, Emily Wenger, Haitao Zheng, Rana Hanocka, and Ben Y. Zhao. “Glaze: protecting artists from style mimicry by text-to-image models.” In Proceedings of the 32nd USENIX Conference on Security Symposium, 2023.
  10. Eubanks, Virginia. Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. New York: St. Martin’s Press. 2017.
  11. Westhues, A. Heinz. “The Militarization of Artificial Intelligence and Autonomous Weapons.” UNISCI Journal 67 (2025): 111–30. https://www.unisci.es/wp-content/uploads/2025/01/UNISCIDP67-4HEINZ.pdf (Accessed: 2025-03-24).
  12. White, Marcus. “Gaza Surgeon Describes Drones Targeting Children.” BBC News, November 13, 2024. https://www.bbc.com/news/articles/c7893vpy2gqo (Accessed 2025-03-27).
  13. Strubell, Emma, Aanya Ganesh, and Andrew McCallum. “Energy and Policy Considerations for Deep Learning in NLP.” In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 3645–50. Florence, 2019.
  14. Lee, Kai-Fu. AI Superpowers: China, Silicon Valley, and the New World Order. Boston: Houghton Mifflin Harcourt, 2018.
  15. Ofcom. “A Deep Dive into Deepfakes That Demean, Defraud and Disinform.” Ofcom, July 23, 2024. https://www.ofcom.org.uk/online-safety/illegal-and-harmful-content/deepfakes-demean-defraud-disinform/ (Accessed 2025-03-28).
  16. Zhai, C., S. Wibowo, and L. D. Li. “The Effects of Over-Reliance on AI Dialogue Systems on Students’ Cognitive Abilities: A Systematic Review.” Smart Learning Environments 11 (2024): 28. https://doi.org/10.1186/s40561-024-00316-7
  17. Danaher, John. “The Threat of Algocracy: Reality, Resistance, and Accommodation.” Philosophy & Technology 32, no. 1 (2019): 23–38. https://link.springer.com/article/10.1007/s13347-015-0211-1 (Accessed 2025-03-28).
  18. Zuboff, Shoshana. The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power. London: Profile Books. 2019.
  19. The Local. “To Become a Citizen, You Should Know Something About Sweden’s Values.” The Local, March 25, 2025. https://www.thelocal.se/20250325/to-become-a-citizen-you-should-know-something-about-swedens-values (Accessed 2025-03-28).
  20. Swedish Government. Cultural Canon Report. 2023. See also Trägårdh, Lars. Cultural Identity and National Narratives. Stockholm: Kulturförlaget, 2022.
  21. Swedish Committee. “En Kulturkanon För Sverige.” Stockholm: Regeringskansliet, 2025. https://www.regeringen.se/rattsliga-dokument/statens-offentliga-utredningar/2025/09/sou-202592/ (Accessed 2025-09-14).
  22. Rosengaard, S., and T. Øland. “The Cultural Policy of Canons and the Role of Intellectuals.” Nordisk Kulturpolitisk Tidsskrift (2018). https://www.scup.com/doi/epdf/10.18261/ISSN2000-8325-2018-01-04
  23. Canon van Nederland. “Outlines”. https://www.canonvannederland.nl/en/hoofdlijnen (Accessed 2025-03-28).
  24. Burima, M. “Integration of the Culture Canon in the Language Acquisition through the CLIL Approach: The Latvian Case Study.” In Proceedings of INTED2021 Conference, 2021.
  25. Google Arts & Culture. Questioning the Canon: Baltimore Museum of Art. Produced by Gamynne Guillotte et al. https://artsandculture.google.com/story/questioning-the-canon-baltimore-museum-of-art/UgXhHC1bTnB0Kg?hl=en
  26. Mohr, John W., Christopher A. Bail, Margaret Frye, Jennifer C. Lena, Omar Lizardo, Terence E. McDonnell, Ann Mische, Iddo Tavory, and Frederick F. Wherry. Measuring Culture. New York: Columbia University Press, 2020.
  27. Boys-Stones, G. “Polyclitus among the Philosophers: Canons of Classical Beauty.” In The Body and the Arts, edited by C. Saunders, U. Maude, and J. Macnaughton, 11–24. London: Palgrave Macmillan, 2009. https://doi.org/10.1057/9780230234000_2.
  28. IBM. “What is Artificial Intelligence (AI)?” IBM, 2020. https://www.ibm.com/topics/artificial-intelligence (Accessed 2025-03-28).
  29. Hewlett Packard Enterprise. “What is Artificial Intelligence (AI)?” HPE, 2020. https://www.hpe.com/us/en/what-is/artificial-intelligence.html (Accessed 2025-03-28).
  30. Bourdieu, Pierre. The Field of Cultural Production. New York: Columbia University Press, 1993.
  31. Ho, Jonathan, Ajay Jain, and Pieter Abbeel. “Denoising Diffusion Probabilistic Models.” Advances in Neural Information Processing Systems, 2020. https://arxiv.org/abs/2006.11239.
  32. Jewitt, Carey. “An Introduction to Multimodality”. In The Routledge Handbook of Multimodal Analysis. London: Routledge, 2011.
  33. Rombach, Robin et al. “High-Resolution Image Synthesis with Latent Diffusion Models”. arXiv preprint, 2022. https://arxiv.org/abs/2112.10752v2.
  34. Sharma, Bhomik. “Stable Diffusion 3.5: Paper Explanation and Inference.” Learn OpenCV. https://learnopencv.com/stable-diffusion-3/ (Accessed 2025-06-20).
  35. Technical advisor: Jonas Jonsson.
  36. Anonymous. “Introduction.” Comfy. https://docs.comfy.org/ (Accessed 2025-03-28).
  37. Darda, Kohinoor M., and Anjan Chatterjee. “The Impact of Contextual Information on Aesthetic Engagement of Artworks.” Scientific Reports 13, no. 1 (2023): 4273. doi: 10.1038/s41598-023-30768-9.
  38. Fleming, Mike. Arts in Education and Creativity: A Literature Review. 2nd ed. Newcastle upon Tyne: Creativity, Culture and Education. 2010.
  39. Hobsbawm, Eric, and Terence Ranger. The Invention of Tradition. Cambridge: Cambridge University Press, 1983.
  40. Neville, Kristofer. The Art and Culture of Scandinavian Central Europe, 1550–1720. University Park: Penn State University Press, 2019. doi:10.5325/jj.22247104.
  41. Stockholms läns museum. “Albertus Pictor—Svensk Medeltids Störste Målare.” https://stockholmslansmuseum.se/utstallningar/medeltidakyrkor/albertus-pictor-svensk-medeltids-storste-malare/ (Accessed 2025-03-28).
  42. British Museum. “A History of Storytelling Through Pictures.” British Museum Blog. https://www.britishmuseum.org/blog/history-storytelling-through-pictures (Accessed 2025-03-28).
  43. Hugging Face. “Basic Training with Diffusers.” Hugging Face Documentation. https://huggingface.co/docs/diffusers/en/tutorials/basic_training (Accessed 2025-03-28).
  44. Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. “A Neural Algorithm of Artistic Style.” Journal of Vision 16, no. 12 (2016). arXiv:1508.06576.
  45. Alpers, Svetlana. “Style is What You Make It.” In The Concept of Style, edited by Berel Lang, 137–62. Ithaca: Cornell University Press, 1987.
  46. Gombrich, Ernst H. Norm and Form: Studies in the Art of the Renaissance. London: Phaidon Press, 1966.
  47. Benjamin, Walter. “The Work of Art in the Age of Mechanical Reproduction” [1935–39]. In Illuminations, edited by Hannah Arendt, translated by Harry Zohn, 217–51. New York: Schocken, 2008.
  48. Manovich, Lev. The Language of New Media. Cambridge, MA: MIT Press, 2001.