Will AI really help your Art Direction?

Part two: Art Direction

January 17, 2025

This essay is part of a multi-part series about the impact of LLMs on game production, with a strong focus on creative and art production in the games industry. But I'm sure most of the reasoning will apply to other environments as well. Read Part 1 for more on AI and the promise of efficiency.
This is a long post. If you just want the tl;dr, check out the Takeaway. I added some additional reads and videos below. Feel free to check them out.

AI systems have been a very hot topic (not only in art-production circles) over the last few years for a number of reasons. One part of the crowd is quick to highlight the new “tool” and its promises of efficiency, while another part points to a number of issues, such as copyright concerns, the recklessly disruptive impact on the industry and the long-term societal effects of the technology. These effects include the loss of a high volume of jobs as well as a decline in artistic education based on the presumption that there will be no entry-level jobs any more. All of this is an incredibly deep topic, and there have been a number of great studies, videos and essays on the impact and state of LLMs in relation to art production. I can only encourage you to get into the topic and try to understand these implications with an open mind.

In this second part of the series I want to discuss the economic and the stylistic implications of the technology for your team and point to the troubles AI brings to the table when it comes to managing your team and providing room for expression and growth. I absolutely encourage you to also read part one since it contains a few points that will be interesting here too.

What is Art Direction

The common perception of art direction is that we start painting “cool” things until we find the best artistic vision for the project. That is of course part of the process, but in reality it is only a minor part of what an art director has to do in their day-to-day work. Art direction is about enabling your team to give their best and develop their abilities to help you find and realize the aforementioned artistic vision. The larger the team and the later into the project cycle you are, the less art direction will be about hands-on art and the more it will be about coaching, feedback and management.

That, however, does not relieve you of your duties. As an art director you do have to know about artistic processes, art fundamentals, human perception, the psychology of images and colors, and color theory. You definitely should have the largest visual library on your team in your mind and always be on the lookout for inspirations, details, moods and trends. You have to bring everything together and it is your job to guide, to inspire and to help fix things. Sure, sometimes your job will be hands-on art again (especially in smaller teams) but if you lack the fundamentals mentioned above, you cannot be an art director.

All of that means one thing more than anything else: as an art director you must keep learning all your life. It’s a constant chase to keep up with trends, news and tech and to hone your skills in the process. If you stop learning you fall behind. Your mind gets stale and rusty and your ability to create visions will stagnate.

But, LLMs are a new technology! It’s best to learn it and keep up, right?

Learning LLMs for art direction

A huge number of people are selling courses for “integrating AI in your skillset”, and art academies are adding AI to their curriculum. Every artist that has seriously tried AI (I have) will tell you one thing: You do not need any kind of prolonged training, any course, let alone a period of study, to get something out of LLMs. The process is very easy and the learning curve is incredibly flat compared to what a life of artistic growth and development demands. The only thing you need is certain wording for the specific LLM you are working with, and once you get the hang of it or copy it from other “AI artists” (they say copying is OK, so you might as well?) you’re good to go. Don’t pay anyone to teach you AI.

The premise for learning art is a very different one. You have to learn an incredibly large number of different things, stay curious about details for a long time and take into account a huge volume of different influences to get anywhere near the level of a professional artist. You must put in a stupid amount of hours, read and research a lot and learn about very scientific applications, anatomy, technology and many other fields, in addition to finding your own style and soul for your work.

I hope you can see the pattern. Starting an artistic career with LLMs is a recipe for failure. Not because you cannot make “beautiful” images fast. It’s because you don’t know everything else around art that you should know in order to decide what to do with those beautiful images. To know how to fix and alter them to fit certain needs. To know about consistency in your project context. If you give an LLM to a fully fledged artist, they know what to do with the output. But with the promise of “we will do it for you, just say what you need!”, will any young student of the arts be willing to put in the hours that are required?

Let’s say you don’t care about raising a new generation of creatives. Your only concern might be that your art team will be more efficient with AI then!

Art and efficiency, do they get along?

Being more efficient through the use of LLMs. What does that even mean? I wrote about efficiency in Part One and I can only recommend that you read it here.

tl;dr: it usually means more output in less time. You can look at things in many different ways. And I am sure you will hear a lot of opinions about how people strive to be fast and efficient in their art processes. So you might think: there is a place for efficiency in art production after all! But let’s take a step back first. What is it that LLMs promise to bring to the table?

For starters the most advertised feature is generating any image with any content in a very short time. That is the draw of the big players (like OpenAI). Universally trained LLMs that will allow you to generate a high volume of variations based on a prompt. Its usefulness in actual art production for games is limited. But use has already propagated into Editorial as it seems easy enough to generate any image you had in mind with a certain style. It is first and foremost used extensively in memes, scams, misinformation and various kinds of pornography (many of which are illegal). The ability to transfer any style somewhat competently to any subject enables a kind of remix process which also concerns a lot of artists that developed distinct styles through a lifetime of hard work.

Other, less mainstream, uses involve actually training an LLM on very specific outputs (e.g. a style and subject very prominently used in your game) and then using it to generate variations of that subject. The use case is clear as this directly plays into asset production. This may be less morally objectionable in the common perception if the dataset really only includes your own assets. The process is very involved though and requires a lot of setup and technical knowledge. But there are gains to be had with that use of the technology. Large companies like Disney aim to train on their extensive library of assets in order to (presumably) cut down on production costs and time, while game studios sometimes try to make it work for generating a large number of assets. Proprietary training is often not feasible for smaller studios, while dedicated providers again raise ethical and legal questions.

From a top down view all LLMs do the same things: they riff on a set of existing data and provide stochastic variations of it. Limited datasets provide narrow output capabilities while broad datasets tend to provide a more diverse outcome at the cost of convergence. Which brings us to our next topic.

The art production cycle

When people are looking at AI outputs you can hear the phrase “it looks like AI”. People refer to a specific feeling of eerie “perfection” and a kind of dreamy sheen that all LLM images seem to sport. It’s clearly noticeable on the common LLM outputs you will find on social media (architecture, people, models, landscape, porn) and becomes less noticeable on stylized outputs (apart from the common errors like six fingers and twisted hand anatomy we all love so much). LLMs seem to home in on certain subject features and a certain blurriness which stems from the diffusion process. LLM providers are improving their models a lot to cope better with anatomy and certain other aspects, but for now it seems the technology has reached a certain plateau that can only somewhat be remedied in the future by increasing the volume of training material. But due to the limitations of that technology, the effect of things being too “strange” and “samey” (sometimes called convergence) is not going away. At most it will become less noticeable with future developments.

With convergence as the preface, let us look at the art process again and at what it means to create original and believable characters, stories and assets. Strongly stylized art and rough concept art have a very unique benefit. They sacrifice details for speed. The results are inherently vague and leave out a lot of details. Sure, in later stages of preproduction and then in production the concept art will start caring for details a lot more. But in early preproduction the art is positively vague. This leaves the viewer a lot of room to think about the subject and project a lot of their own visions onto it. It will facilitate ideas, brainstorming and the development of more refined ideas, which will ideally lead to a complete vision. Whenever I see projects failing it is because of a lack of direction. In art direction specifically that means failing to provide a confident artistic guideline for the team.

Back to the topic. Being vague early and explicit later in the production process does a lot for the health of the art production on any given project. It allows you to iterate early (when you actually should and need to) and forces you to provide the information you need to make it believable and coherent later in the process. Spoiler warning: LLMs help at neither stage.

Why would I say that? As you read earlier, LLMs give us “nice” images and a high output volume. It should be perfect for iterating? The problems with LLMs in the art production process are threefold. Let’s look at them one by one.

Developing a vision

LLMs create very detailed and fully realized images. In the context of early art production (preproduction), at a stage where we should still be looking for a style and subject, we overburden our mind with a flood of detailed images that leave next to no room for imagination.

Let me explain. The human mind is very good at pattern recognition. We are trained to do so. In fact, when we merely hear about something we know, it instantly pops up in our mind. Please follow me for a short experiment:

DO NOT think of a blue elephant sitting in a lemon tree grove on a sunny day.

Did you see it? Even though I told you not to think about it? I did. Most people do. The processes of the human mind had quite a few thousand years to cook. By and large we all work similarly. Most of us aren’t savants and geniuses. And even if you believe that you are one of the few art directors who actually is a super genius, I highly doubt it. You should doubt it too. Keep in mind that the large processes of your mind work the same for you as for most other people. So what does that mean for us? If we look at a fully realized “nice” output image of an LLM, everything is there. Our minds will instantly fixate on certain elements and “we” will be in that image with our minds. There will be no room left to find the special things that make some games and movies and books so special.

I hope you can see how LLMs are actually a hindrance in that process instead of a help. They are not a tool but a barrier to open-mindedly developing your artistic vision.

Information Overload

We just talked about the human mind, but there is more to unpack. Human brains have a max capacity. Probably a max total capacity too, but we don’t know that yet. They definitely have a max short-term and mid-term capacity. We cannot process an infinite amount of information in any given time frame. In order to make our brain ready again, we need to rest. A coffee break will get us ready for the next task, while during sleep our brain will learn and process important information for mid- and long-term storage. That is how we learn. Every one of us (except people with certain conditions like ADHD or particular expressions of autism). The younger we are, the more we believe that we can just stay awake forever and process info better than older people. While it is true that young people have the ability to regenerate and memorize faster, the overall process stays the same. If you overdo it (pulling a few all-nighters in a row), you will drive your performance into the ground (in addition to the serious health risks that poses).

For example: in order to design good user interfaces and a good user experience designers need to know about this and respect this. In order for visual design (characters, environments, props, really everything) to work we need to balance high and low detail frequency because the viewer will either be bored (low detail frequency everywhere) or overloaded (high frequency detail everywhere) and they will look away.

In short: Too much information incoming and our minds will shut off. That is also the reason why super extensive reference boards can be more hindrance than help. Well curated references help us because we can still oversee and process all information. Have you ever been in the situation where your customer or director gave you their “references” as huge, unfocused and unsorted folders or imageboards and you were just confused? That’s why. We simply cannot easily process the amount of information on display. Here we can loop back to LLMs. When playing with an LLM in search of inspiration, most users overcommit through fear of missing out on the “perfect” reference or idea. But out of hundreds of AI images, users will often fall back onto the earlier outputs, because those were still in short-term memory before the overload set in and because we had already committed mentally to one of them (as discussed earlier). If you think that LLMs will help you iterate faster on an idea, think twice. Putting LLMs to good use will still require the same setup: Multiple flights of low volume output spread over multiple breaks or days. Otherwise it will just be a mess. You cannot force your brain to work differently. And the onset of LLM tools will not change this (not in the next thousand years or so). It has, however, profound effects on your abilities, even short term, that you should be aware of. More on that later.

Attention to detail

The third major point I want to write about is attention to detail. I already wrote about how detail becomes crucial in later stages of production. Why? There are multiple reasons for that. Let us start with the internal reason every team should at least think about. Every piece of concept art will change hands eventually and end up on the desk of a production artist (3D modeler or 2D artist). As you should know, they will have questions. A lot of them actually. What is this supposed to be? How should it work? Does it need to move? Among other questions. Sure, the problem is more pronounced in 3D pipelines, but for the most part the same questions arise in 2D as well. So good concept art should provide answers to those questions. Literal, visual answers and not a vague “make it work somehow”.

LLMs, out of the box, don’t provide those answers a good amount of the time. Details are either completely wrong and indecipherable, or they seem kinda right but their application is wrong. Belts and bands vanish “somewhere” or connect to parts of the clothing that don’t really need them. Technical details, like machines, spaceship hulls or cyborg technology, are just completely bonkers. Pistons are a concept that is harder for an LLM to get right than hands. As soon as you want an actual problem solved instead of a result that looks like an answer, LLMs fall apart. Using LLM outputs for concept art without fixing the details first is absolutely not advised. And if you start fixing LLM outputs you will spend an ungodly amount of time setting them right. You will notice how much of it is actually off or just plain wrong. In the end it is not uncommon to spend around the same time fixing LLM outputs as it would have taken you to get things right in the first place without an LLM. Try it in a serious manner and you will find out.

But even if you want to disregard the above for some reason there is another reason to tread carefully. The willing suspension of disbelief. Let us be serious for a moment. There are no aliens. No death stars. No goblin overlords. Magic weapons are rather rare in real life. But why are we willing to play a game, watch a movie or read a story about all of that? Not only read, but why do we devour/binge/game them like crazy? Because everything we read or see is believable enough that we are willing to give up reason in favour of a good story.

Here’s the caveat though: believable enough. As human beings we carry pretty good references with us about how things look. Because we look at so many things every day, we know at least subconsciously how most of them should look. We know how heavy a hammer should be and whether it looks right when people handle the tool. We know what trees look like. We are experts in emotion and facial recognition. Sounds, smells, really everything around us is taken in and kept ready by our brains to recall, display and mix and match. Just in case we need to talk to someone about a blue elephant in a lemon grove on a sunny day. This makes every single one of us an expert in checking the “realism” or believability of the things we see. The more realistic, the more extreme this effect is. And even in stylized things we tend to notice the details. Sure, if asked, most of us cannot pin down “why” it doesn’t look right. But we can sure tell “that” it looks off.

LLMs also don’t help with that because they do not know about “right” (They don't know anything as a matter of fact. They work with statistical relations). The output they produce might be convincing for a fast glance, but putting it before an observer for a longer time will make the LLM output fall apart. That is why they work so well as social media fodder or editorial images. Because it is fast food art that is mindlessly consumed and forgotten the moment you swipe away or turn the page. Is that a good proposition for the creative endeavors of you and your team?

Working with a team as a perspective

Which brings me to the last part of this essay. Since you just read about various mental effects and the basics of art and art direction that we need to take into account, I want to write about one very important topic that is often left out of the general discussion about art direction with the help of AI. That is working with your team and developing and maintaining skills in a team environment. Special skills and their development in individual team members play an integral role in three things: motivation, the overall quality of the project output and the mental load on art directors and lead artists.

Motivation and overall quality should be a no-brainer. The higher the skill level and the motivation of the individual team members, the better the expected outcome. Motivation can balance lacking skill to a certain degree because motivation makes the one crucial task in an art production team somewhat easier: learning. Of course learning is the direct path to skill. Some project leads think that hiring people that already have 100% of the skills needed would be a path to success. In game development especially that isn’t true most of the time. Technology changes so fast that old skills become invalidated and new skills necessary during the course of a two year project. New tools, new techniques and unique project requirements all contribute to that reality. In my experience hiring motivated individuals with around 70% of the required skill and a portfolio that demonstrates the ability to adapt and learn is far more valuable.

The number one job you have as an art director or team lead is to build skills and keep motivation high. Studio environments that purely focus on efficiency will often lower the team’s motivation. They also fail to acknowledge that developed skill and a cohesive team contribute hugely to success, as this enables growth and knowledge transfer. Skilled team members are thus not easily replaceable. Laying them off in favour of efficiency will lead to a sudden loss of skill in the team and put more pressure on the rest of the workforce, in turn wearing them down quickly. This will lead to frustration and staff turnover and have a measurable negative impact on your project and, mid to long term, on your studio.

The use of LLMs as pillar production tools will facilitate loss of skill through layoffs and staff turnover and thus reduce the amount of knowledge and skill in your team’s workforce. But it will also affect users on a very personal level. Multiple studies have been published at this point underlining what we have known in our day-to-day lives forever: Once you stop practicing an ability, it will decline. Thinking and creativity behave like any other ability we acquire in this regard. Once we start offloading creative work and critical thinking to someone else (AI or person, it does not matter) our own abilities decline and we start building reliance on someone or something else. For art especially that means different things depending on where you are in your art career.

Seasoned artists and directors will fall back on easy solutions where quality work would demand engaging with the theme and developing or finding your vision. They will lose attention to detail, story and worldbuilding more easily and be more prone to giving “lazy” direction through generated outputs. I have both witnessed this myself and heard accounts of it. As a matter of fact, humans are lazy if they are allowed to be.

Midlevel artists will fail to build skills. The lack of knowledge transfer coupled with lazy direction will keep them from building up the essential skills they need to grow into directing others and to develop their own ideas and visions. Very important tasks and the love for attention to detail needed to answer pressing production questions will be neglected, because the neglect usually trickles down from the top. Failing to grow that curiosity to understand how things work and look will stop your artistic and career development dead in its tracks. Just knowing the tools is not enough.

As a junior you will be hit hardest. Entry-level jobs are already becoming scarce because offloading simple image tasks to AI is easier and seems to save lead and senior artists time. However, as I wrote above, the education of junior artists is an investment in the future skill portfolio and density of your studio! Getting a junior artist to senior level in a good studio environment will net you a motivated team member that knows the tools very well and possesses the basic skills to build upon and tackle more advanced tasks. Failing to bring up a new generation of artists as an industry will result in a huge demand for skilled workforce in 5 to 10 years that will be hard and costly to satisfy, if it can be satisfied at all.

Takeaway

Art direction is a demanding task that needs confidence and skill. Skill that you build over a long career and that you need to practice constantly. LLMs will not only impact your cognitive and creative abilities but also impair your ability to direct and develop your team. They will reduce the mid to long term capabilities of your team and hinder or even prevent artistic growth among younger artists, ultimately reducing the long term motivation and viability of your studio’s art production. While you might gain short term savings, the long term effects will be felt by you, your team and society as a whole.

Learning to use LLMs is a low skill, low investment task. Catching up is a non-issue compared to what artists, especially in game production, are used to. The fear of losing to the competition or missing a trend has no basis whatsoever in reality. If anything, AI will get even easier to use and take even more responsibility from you.

Experimenting with certain aspects of LLMs, such as custom trained models, can be ethical but needs a high level of technological insight and a large pool of training data. Nonetheless the technology can yield good results for tasks that are too hard to survey or very tedious for humans, such as iterating over large amounts of data.

I'll write about other implications in Art Direction, Art Education and Project Vision in the last part of this series. If you want to continue the conversation, feel free to get in touch on LinkedIn.

Further information

Written by Richard Schmidbauer. Richard has been working as Lead Artist and Art Director for indie games since his debut in the industry in 2007. He has been a gamer and a creative person his whole life. But part of him is driven to science and tech. That's why he sometimes tends to think about things in a rather analytical way...