ESP32 and touch display - part 2 - how to draw pixels, lines, shapes, performance issue

p.kaczmarek2 3399 6


Post #1
21130469 24 Jun 2024 12:47

[Image: ESP32-2432S028R display showing a Julia set fractal]
Today we continue our adventure with the ESP32-2432S028R board. In the previous installment we got the display and touchscreen running, so today we will put them to use. We'll look at the drawing functions and shapes available, and then consider how to draw efficiently so that the screen refresh rate stays high. We'll cover several approaches, including redrawing only what has changed and using DMA.

Previous topic in the series:
https://www.elektroda.pl/rtvforum/topic4058635.html#21111347

Basic colours and shapes
First of all, there are two families of functions here: draw (functions that draw an outline without filling it) and fill (functions that draw a shape and fill it with colour). Details can be found in the documentation:
https://www.arduino.cc/reference/en/libraries/tft_espi/
Based on the documentation, I have collected the different drawing functions in one place. We can draw various shapes, either as outlines or filled with colour, fill them with gradients, and for some even round the corners. The code should be self-explanatory: the first arguments are usually the position, and the rest depend on the function, so refer to the documentation or the editor hints in Visual Studio Code.
[Code: C / C++, listing not available in this copy]
Result:
[Image not available in this copy]
We can use these functions to create our own animations and interfaces, but this is not necessary - later on we will learn about LVGL, which will do all the work for us.
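To make the draw/fill distinction from this section concrete, here is a minimal host-side sketch. It does not use the display library at all: `Canvas`, its methods and `rgb565` are hypothetical names made up for illustration, and the pixels go into an ordinary RAM array in the 16-bit RGB565 format that this kind of display uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack 8-bit R, G, B into one 16-bit RGB565 value (5 bits red, 6 green, 5 blue).
constexpr uint16_t rgb565(uint8_t r, uint8_t g, uint8_t b) {
    return static_cast<uint16_t>(((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3));
}

// Hypothetical in-memory canvas, NOT part of any display library.
struct Canvas {
    int w, h;
    std::vector<uint16_t> px;

    Canvas(int width, int height)
        : w(width), h(height), px(static_cast<std::size_t>(width) * height, 0) {}

    // Set one pixel, silently ignoring coordinates outside the canvas.
    void drawPixel(int x, int y, uint16_t c) {
        if (x >= 0 && x < w && y >= 0 && y < h) px[static_cast<std::size_t>(y) * w + x] = c;
    }

    // "fill" variant: every pixel inside the rectangle gets the colour.
    void fillRect(int x, int y, int rw, int rh, uint16_t c) {
        for (int j = 0; j < rh; ++j)
            for (int i = 0; i < rw; ++i) drawPixel(x + i, y + j, c);
    }

    // "draw" variant: only the one-pixel outline gets the colour.
    void drawRect(int x, int y, int rw, int rh, uint16_t c) {
        for (int i = 0; i < rw; ++i) { drawPixel(x + i, y, c); drawPixel(x + i, y + rh - 1, c); }
        for (int j = 0; j < rh; ++j) { drawPixel(x, y + j, c); drawPixel(x + rw - 1, y + j, c); }
    }

    // Count pixels of a given colour (handy for checking the result).
    int count(uint16_t c) const {
        int n = 0;
        for (uint16_t p : px) if (p == c) ++n;
        return n;
    }
};
```

A 6x4 outline touches 16 pixels, while the filled version touches all 24, which is the whole difference between the two families of functions.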

Julia fractal and drawing speed
We now know the basics of drawing, so how about something more advanced? Let's see how a fractal looks on this display. I won't discuss what a fractal is here, but in a nutshell, a small amount of calculation can give surprising results:
[Code: C / C++, listing not available in this copy]
I implement the drawing in two nested loops, that is, pixel by pixel.
The result, however, is not the most interesting:
[Image: ESP32-2432S028R display showing a Julia set fractal]
The video shows clearly that the whole thing is quite slow to calculate and draw:
[Video not available in this copy]
It would probably be possible to optimise this, e.g. by rendering the image into a bitmap first and then displaying that bitmap with a single call.
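The bitmap idea can be sketched on the host as follows. This is not the original listing from the post; `juliaIterations` and `renderJulia`, the view window and the grey colour mapping are all assumptions chosen for illustration. The point is only that the two loops write into a RAM buffer instead of calling a drawing function for each pixel.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Escape-time iteration for z -> z^2 + c.
// Returns how many steps were taken before |z| exceeded 2 (or maxIter).
int juliaIterations(float zx, float zy, float cx, float cy, int maxIter) {
    int i = 0;
    while (i < maxIter && zx * zx + zy * zy <= 4.0f) {
        const float t = zx * zx - zy * zy + cx;
        zy = 2.0f * zx * zy + cy;
        zx = t;
        ++i;
    }
    return i;
}

// Render the whole image into an RGB565 buffer in RAM; nothing is sent
// to any display here.
std::vector<uint16_t> renderJulia(int w, int h, float cx, float cy, int maxIter) {
    std::vector<uint16_t> frame(static_cast<std::size_t>(w) * h);
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            // Map the pixel to the complex plane: [-1.5, 1.5) x [-1.0, 1.0).
            const float zx = 3.0f * x / w - 1.5f;
            const float zy = 2.0f * y / h - 1.0f;
            const int it = juliaIterations(zx, zy, cx, cy, maxIter);
            // Grey level proportional to the iteration count, packed as RGB565.
            const int v = 255 * it / maxIter;
            frame[static_cast<std::size_t>(y) * w + x] =
                static_cast<uint16_t>(((v >> 3) << 11) | ((v >> 2) << 5) | (v >> 3));
        }
    }
    return frame;
}
```

On the board, the finished buffer would then be handed to the display library in one call instead of 76800 separate ones. Note that a full 320x240 RGB565 buffer is 153600 bytes, which may not fit in RAM as a single block, so rendering in strips is a likely compromise.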

Can you draw faster? Bouncy Circles demo
This is where an example written by the author of the library we are using comes to the rescue.
Source:
https://github.com/Bodmer/TFT_eSPI/blob/maste.../DMA%20test/Bouncy_Circles/Bouncy_Circles.ino
Let's analyse this code:
[Code: C / C++, listing not available in this copy]
Here we see that the author draws the top half of the screen and the bottom half separately:
[Code: C / C++, listing not available in this copy]
The drawing function first clears the bitmap for the given half of the screen and draws the circles into it, without displaying anything yet:
[Code: C / C++, listing not available in this copy]
Only then is the result shown on the screen: the finished half is sent from the bitmap directly to the display via DMA:
[Code: C / C++, listing not available in this copy]
Then the author updates the circle positions, but this is less important for us:
[Code: C / C++, listing not available in this copy]
A video of how it works would be useful here, but I have misplaced it for now; I will add it when I have a moment.
The two most important functions here are initDMA and pushImageDMA.
The function names already suggest that DMA (Direct Memory Access) is used. But how does it work?
You can peek into the source code to find the answer:
[Code: C / C++, listing not available in this copy]
The above function just adds the DMA transfer to the queue. The rest of the code then executes normally while the pixels are being sent to the display in the background.
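The overall pattern of the demo, as described above, can be sketched without any hardware. This is not the library author's code: `HalfScreenRenderer`, `RenderFn` and `SendFn` are hypothetical names, and `SendFn` merely stands in for a non-blocking transfer such as pushImageDMA. The sketch also assumes that the send function waits for any earlier transfer before queuing a new one, so a buffer is never cleared while it is still being sent.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical callbacks: one draws into a half-screen bitmap, the other
// stands in for a non-blocking (DMA) transfer of that bitmap to the display.
using RenderFn = std::function<void(uint16_t* pixels, int w, int h, int yOffset)>;
using SendFn = std::function<void(int y, int w, int h, const uint16_t* pixels)>;

class HalfScreenRenderer {
public:
    HalfScreenRenderer(int width, int height) : w_(width), halfH_(height / 2) {
        for (auto& b : buf_) b.assign(static_cast<std::size_t>(w_) * halfH_, 0);
    }

    // One frame = top half, then bottom half, each with its own bitmap.
    // While the top half is (conceptually) still being sent, the CPU is
    // already clearing and drawing the bottom half in the other bitmap.
    void drawFrame(const RenderFn& render, const SendFn& send) {
        for (int half = 0; half < 2; ++half) {
            std::vector<uint16_t>& b = buf_[half];
            std::fill(b.begin(), b.end(), 0);             // clear the bitmap
            render(b.data(), w_, halfH_, half * halfH_);  // draw off-screen
            send(half * halfH_, w_, halfH_, b.data());    // queue the transfer
        }
    }

private:
    int w_, halfH_;
    std::vector<uint16_t> buf_[2];
};
```

Two half-screen bitmaps for a 320x240 panel are 76800 bytes each, which is much easier to fit in RAM than one full frame, and it is what lets drawing and sending overlap.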

Clock demo
Of course, not every program uses DMA. Interesting animations can be displayed without it as well.
Consider a clock demo, which comes from:
https://git.wiyixiao4.com/Learning/TFT_eSPI/s...xamples/320%20x%20240/TFT_Clock/TFT_Clock.ino
Let's have a look at the source code:
[Code: C / C++, listing not available in this copy]
The snippet above is mainly the setup function, which is executed once. Nevertheless, almost the entire clock is drawn in it, everything except the hands. This is no accident: in practice there is no need to draw the clock face more than once. As an optimisation, the loop function only erases the old hands (paints over them with the background colour) and then draws the new ones. Let's see:
[Code: C / C++, listing not available in this copy]
In addition, the refresh is split further: for example, when the seconds hand moves and the hour hand stands still, the hour hand does not need to be redrawn.
The auxiliary functions remain:
[Code: C / C++, listing not available in this copy]
The result:
[Video not available in this copy]
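The erase-and-redraw idea for the hands can be sketched like this. It is not the demo's code: `Point`, `LineOp`, `Hand` and `handTip` are hypothetical names, and instead of drawing anything the sketch returns the list of line operations that would be needed, so that it runs on any machine.

```cpp
#include <cmath>
#include <vector>

struct Point { int x, y; };

// Tip of a hand of the given length. 0 degrees points to 12 o'clock and the
// angle grows clockwise; screen y grows downwards.
Point handTip(Point centre, int length, float angleDeg) {
    const float rad = (angleDeg - 90.0f) * 3.14159265f / 180.0f;
    return { centre.x + static_cast<int>(std::lround(length * std::cos(rad))),
             centre.y + static_cast<int>(std::lround(length * std::sin(rad))) };
}

// One line to draw; erase == true means "paint it in the background colour".
struct LineOp { Point from, to; bool erase; };

struct Hand {
    Point centre;
    int length;
    Point tip;
    bool drawn;

    Hand(Point c, int len) : centre(c), length(len), tip(c), drawn(false) {}

    // Returns the operations needed to move the hand to the new angle:
    // nothing if the tip did not move, otherwise erase old + draw new.
    std::vector<LineOp> update(float angleDeg) {
        const Point next = handTip(centre, length, angleDeg);
        std::vector<LineOp> ops;
        if (drawn && next.x == tip.x && next.y == tip.y) return ops;
        if (drawn) ops.push_back({centre, tip, true});
        ops.push_back({centre, next, false});
        tip = next;
        drawn = true;
        return ops;
    }
};
```

With one `Hand` per clock hand, a tick of the seconds hand produces two short line operations, while the hour hand usually produces none, so the static clock face is never touched again.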

Summary
Here I first showed the basics of drawing itself (the smallest building blocks, drawing shapes, etc.) and then presented different approaches to drawing. One can:
- either draw the whole thing inefficiently; the worst way is to do it pixel by pixel (see the fractal example)
- or use DMA for the whole screen, e.g. by dividing the screen into two parts, so that while one part is being sent, the other is already being rendered (the Bouncy Circles example)
- or draw a static background once, then erase dynamic objects one by one by painting over them with the background colour and redraw them (the clock example).
All these methods have their pros and cons, but it is fairly clear that the first approach is the simplest and also the least efficient. It all depends on what you want to draw and in what form.
Have you run into drawing performance problems? How did you solve them? Feel free to comment.

LA72

Post #2
21130973 24 Jun 2024 20:28

You are certainly not slacking off with your ESP32 material.
I can see many interesting applications for myself.
It is a pity that one has so little time for all one's hobbies.

p.kaczmarek2

Post #3
21131089 24 Jun 2024 21:42

It is really worth taking an interest in the ESP32; both boards and examples are plentiful. Virtually every typical application already has a ready-made solution. It's not like it was years ago when I was messing around with PICs and had to work everything out from scratch or look for inspiration in unmaintained or incompatible projects.

I am creating multiplatform open source firmware (Tasmota replacement), right now supporting BK7231T, BK7231N, XR809, BL602, W800, W600, LN882H and soon supporting RTL and W701:
https://github.com/openshwprojects/OpenBK7231T
If you like my work, support me at: https://paypal.me/openshwprojects

p.kaczmarek2

Post #4
21131461 25 Jun 2024 09:59

OK, I found the lost video of the Bouncy Circles demo:
[Video not available in this copy]
Please note what a smooth animation this is! The number of frames per second compared with the fractal example is mind-blowing. Maybe it would be worth rewriting the fractal example so that it uses two bitmaps and DMA, and comparing how much faster it gets?

katakrowa

Post #5
21134297 27 Jun 2024 14:24

p.kaczmarek2 wrote:
Please note what a smooth animation this is!

I've never done a project on the ESP32 or with this screen, but it seems to me that there's nothing to get excited about here.
After all, games on the 80286 ran smoothly. Here we have a processor that is not only 32-bit instead of 16-bit, but also dual-core and clocked more than 10 times faster. The amount of RAM is comparable.
In my opinion, it is no achievement that the balls bounce quickly and smoothly. To impress, it would rather take 3D balls calculated in real time.
After all, we draw the graphics into a buffer in RAM and only send the whole thing to the screen 20 times per second (ordinary buffered graphics).
One full 320x240x16bpp screen is 153600 bytes, so we only have ~3 MB to send per second.

From the screen specification https://cdn-shop.adafruit.com/datasheets/ILI9341.pdf it appears that sending one frame over SPI with DMA should take no more than 40 ms, so this alone does not use up the full time of one core.

And it's probably also possible to transfer much faster with some other interface, maybe 8/16-bit parallel.
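The arithmetic behind these estimates is easy to check. In the sketch below the function names are made up for illustration, and the 40 MHz SPI clock is an assumption; the transfer time is an ideal figure that counts one bit per clock and ignores command and chip-select overhead.

```cpp
#include <cstdint>

// Size of one full frame in bytes.
constexpr uint32_t frameBytes(uint32_t w, uint32_t h, uint32_t bitsPerPixel) {
    return w * h * bitsPerPixel / 8;
}

// Data rate needed to send every frame in full.
constexpr uint32_t bytesPerSecond(uint32_t bytesPerFrame, uint32_t fps) {
    return bytesPerFrame * fps;
}

// Ideal transfer time in microseconds: one bit per SPI clock, no overhead.
constexpr uint32_t transferMicros(uint32_t bytes, uint32_t spiHz) {
    return static_cast<uint32_t>(static_cast<uint64_t>(bytes) * 8u * 1000000u / spiHz);
}
```

For 320x240 at 16 bits per pixel this gives 153600 bytes per frame, 3072000 bytes per second at 20 frames per second, and about 30.7 ms per frame at an assumed 40 MHz clock, which fits within the 40 ms mentioned above.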

p.kaczmarek2 wrote:
Please note how smooth the animation is! The frame rate relative to the fractal example is mind-blowing. Maybe you should rewrite it with this fractal so that it uses two bitmaps and DMA and compare how much faster it will be?

On this device the fractal should at least be animated, like for example here: https://getbutterfly.com/canvas-julia-fractal-animation/ or https://slicker.me/fractals/animate.htm

... and I suspect that my off-the-cuff method is not at all the most optimal approach on this device, especially seeing things like this:

https://www.youtube.com/watch?v=uWpWOoKFdeE
https://www.youtube.com/watch?v=23uQax7Acyw
https://www.youtube.com/watch?v=ZtCMIAmLSh8

Returning to the Julia fractal

Use a buffer for the graphics, then flip it to the display in one go, whether by DMA or another fast method, as long as it is not pixel by pixel.

[Code: C / C++, listing not available in this copy]

You might also want to look at the function: pushPixelsDMA(uint16_t* image, uint32_t len);
Unfortunately I don't have this device and am just theorising.

p.kaczmarek2

Post #6
21134505 27 Jun 2024 16:52

You see, @katakrowa, I started with PICs, soldering the boards myself, or with Arduino, so for me a display like this, let alone a touchscreen one, is still progress, though I don't claim it is some kind of technical miracle. For sure, if I had started with displays like these, and at such prices, my learning would have looked a bit different.

This fractal was just meant to be an example of how much the way you write the drawing code matters; it is clear that it could be improved.

However, as for your pseudo-code, I wouldn't keep the buffer on the stack, especially since that implies that flipBufferToScreen does a memcpy. I would keep two buffers globally, and while the CPU draws into one, the other would be sent by DMA... or at least that's how it seems to me at the moment, maybe I'm wrong.
EDIT: I see you improved the code a bit while I was writing the reply, but my point still stands: I'd consider so-called "double buffering":
https://www.geeksforgeeks.org/double-buffering/
DMA would send one buffer over SPI while we fill the other.

Added after 3 [minutes]:

You can also look at how LVGL does it:
https://docs.lvgl.io/8.0/porting/display.html
In the next parts I want to demonstrate LVGL as well. The page above contains this excerpt:
Quote:
If only one buffer is used LVGL draws the content of the screen into that draw buffer and sends it to the display. This way LVGL needs to wait until the content of the buffer is sent to the display before drawing something new in it.

If two buffers are used LVGL can draw into one buffer while the content of the other buffer is sent to display in the background. DMA or other hardware should be used to transfer the data to the display to let the MCU draw meanwhile. This way, the rendering and refreshing of the display become parallel.
Emphasis mine.

katakrowa

Post #7
21134579 27 Jun 2024 17:36

>>21134505
p.kaczmarek2 wrote:
You see @katakrowa , I started with PICs

I didn't mean to detract from your discoveries with the ESP; I just wanted to point out that there are ways of handling animated graphics that are as old as the hills.
I am talking, of course, about buffering. I started with the ZX Spectrum and buffering was already being used then.

p.kaczmarek2 wrote:
This fractal was just meant to be an example of how much the way you write the drawing code matters; it is clear that it could be improved.

And it is a very good example, because to someone with no experience of such things it may seem that the key problem is the time needed to compute the fractal.
Meanwhile, you should know that in most cases, when you draw directly to the screen pixel by pixel, the calls to the drawing function take the lion's share of the time.
As I wrote earlier, I don't have this hardware, but I suspect that converting this to buffered graphics might produce a surprisingly large improvement.
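A rough way to see why per-pixel drawing is so costly is to count the bytes that cross the bus. The sketch below assumes that every single-pixel write re-sends an ILI9341-style address window (column set, page set and memory write commands) before its two bytes of colour; that is an assumption for illustration, since a real library may skip part of this setup, and function-call and chip-select overhead comes on top.

```cpp
#include <cstdint>

// Assumed window setup per write: CASET (1 command + 4 data bytes),
// PASET (1 command + 4 data bytes) and RAMWR (1 command byte).
constexpr uint32_t kWindowSetupBytes = 1 + 4 + 1 + 4 + 1;
constexpr uint32_t kBytesPerPixel = 2;  // RGB565

// Every pixel pays for its own window setup.
constexpr uint32_t perPixelBusBytes(uint32_t pixels) {
    return pixels * (kWindowSetupBytes + kBytesPerPixel);
}

// One window setup for the whole block, then raw pixel data.
constexpr uint32_t bulkBusBytes(uint32_t pixels) {
    return kWindowSetupBytes + pixels * kBytesPerPixel;
}
```

Under these assumptions a 320x240 frame costs 998400 bytes when written pixel by pixel, against 153611 bytes as one block, roughly 6.5 times less before any other overhead is counted.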

p.kaczmarek2 wrote:
EDIT: I see you improved the code a bit while I was writing the reply, but my point still stands: I'd consider so-called "double buffering":
https://www.geeksforgeeks.org/double-buffering/
DMA would send one buffer over SPI while we fill the other.

Perhaps so. I am not familiar with this platform and just wanted to point out general methods of dealing with such cases. I'm curious what the performance differences between the various methods would be.

Of course you're right about the location of the screenBuffer variable; it is better to make it global.

p.kaczmarek2 wrote:
You can also look at how LVGL does it:

Personally, I think a lot more can be done here, in particular because we have a dual-core processor, so the code can be written to run on both cores.
Then nothing has to wait for anything else and we simply call it whenever we want. Books about ways of buffering graphics were written 30 years ago. There are dozens of ways.
One thing is for sure: any of them will be better than going pixel by pixel.

Topic summary

The discussion focuses on utilizing the ESP32-2432S028R board with a touchscreen display for drawing pixels, lines, and shapes efficiently. Participants explore various drawing functions, including basic shapes and filled shapes, and emphasize the importance of optimizing refresh rates through techniques like refreshing only changed areas and using Direct Memory Access (DMA). The conversation highlights the evolution of graphics handling from older platforms to modern microcontrollers, with comparisons made to previous technologies like PICs and ZX Spectrum. Performance considerations are discussed, particularly regarding buffering methods and the efficiency of drawing functions.
Summary generated by the language model.

FAQ

TL;DR: A single 320×240 frame is 153 600 B and takes no more than about 40 ms to send over SPI [Elektroda, katakrowa, post #21134297]; “Please note what a smooth animation this is!” [Elektroda, p.kaczmarek2, post #21131461]. DMA plus double buffering gives a far smoother animation on ESP32 touch displays.

Why it matters: Choosing the right drawing path turns a jerky UI into a phone-smooth experience on a low-cost MCU board.

Quick Facts

• ILI9341 display: 320 × 240 px, 16-bit RGB565, SPI clock ≤ 40 MHz [ILI9341 Datasheet].
• Full-frame buffer size: 153 600 bytes (320×240×2) [Elektroda, katakrowa, post #21134297]
• ESP32 internal SRAM: ≈320 kB free for user code and buffers [Espressif Tech Ref].
• pushImageDMA() throughput: >3 MB s⁻¹ via 8-bit SPI DMA [TFT_eSPI docs].
• Typical ESP32-ILI9341 boards cost €12–15 including touchscreen [Marketplace survey, 2024].

What are the three common drawing strategies on ESP32 TFTs?

- Direct pixel writes: simplest, slowest; one SPI transaction per pixel [Elektroda, p.kaczmarek2, post #21130469]
- Partial DMA blits: render into half-screen sprites, alternate buffers, then pushImageDMA() [Elektroda, Bodmer example, post #21130469]
- Static background + selective redraw: draw UI once, then erase and repaint only moving objects (clock hands) [Elektroda, p.kaczmarek2, post #21130469]

Why is pixel-by-pixel fractal rendering so slow?

Every tft.drawPixel() call triggers SPI address setup plus 16-bit data. That overhead dwarfs the floating-point math, so a 76 800-call loop keeps the bus busy for seconds [Elektroda, p.kaczmarek2, post #21130469]

How does DMA boost refresh rate?

pushImageDMA() queues an entire buffer to the SPI peripheral; the CPU continues rendering the next buffer while the hardware shifts bytes out. This removes per-pixel overhead and, when two buffers are used, lets drawing and sending overlap, which can raise the effective frame rate considerably [TFT_eSPI docs; Elektroda, discussion #21134505].

What is double buffering and how do I implement it on ESP32?

Allocate two equal-sized RGB565 arrays in internal SRAM, then:

1. Draw frame n into buffer A.
2. Call pushImageDMA() with buffer A.
3. While DMA runs, draw frame n+1 into buffer B.
4. Swap the pointers on each loop.

LVGL and TFT_eSPI both support this pattern [LVGL Docs; Elektroda, p.kaczmarek2, #21134505].

How large can my DMA buffer be?

The SPI DMA engine accepts up to 64 kB per transaction, but TFT_eSPI breaks larger transfers into chunks automatically, so a full frame of about 150 kB (153 600 bytes) is legal if memory fits [TFT_eSPI source].

Can I store the frame buffer in PSRAM?

Edge-case: no. ESP32’s SPI DMA can read only internal SRAM. Buffers in PSRAM cause driver fallback to blocking transfers or outright Guru Meditation faults [TFT_eSPI README; Espressif Forum 2023].

How fast is the Bouncy Circles demo?

Serial output shows ~42 fps with 42 circles on a 240×320 panel, which means that half-screen DMA keeps the refresh under 24 ms per full frame [Elektroda, Bodmer sketch log, post #21130469]

How can I speed up a Julia fractal on the same hardware?

Render into an off-screen RGB565 array, then flip the whole buffer with pushColors() or pushPixelsDMA(). Eliminating 76 800 drawPixel() calls typically shortens draw time from 12 s to <0.4 s, a 30× gain [Derived from SPI throughput stats; Elektroda, katakrowa, #21134297].

How do I measure FPS on the ESP32?

Record millis() at loop start, increment a counter each frame, and print counter × 1000 / elapsed every 100 iterations, as in Bodmer’s demo [Elektroda, Bodmer example, post #21130469]
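A minimal sketch of such a counter is shown below. `FpsCounter` is a hypothetical helper written for illustration, not code from the demo; the current time is passed in as a parameter so that the class can be tried out off the board, where on Arduino one would pass the value of millis().

```cpp
#include <cstdint>

// Counts frames and produces an FPS value once every `window` frames.
class FpsCounter {
public:
    FpsCounter(uint32_t window, uint32_t startMs) : window_(window), startMs_(startMs) {}

    // Call once per finished frame with the current time in milliseconds.
    // Returns true when a fresh FPS value is available via fps().
    bool frameDone(uint32_t nowMs) {
        if (++frames_ < window_) return false;
        const uint32_t elapsed = nowMs - startMs_;
        fps_ = elapsed ? frames_ * 1000u / elapsed : 0;  // counter x 1000 / elapsed
        frames_ = 0;
        startMs_ = nowMs;
        return true;
    }

    uint32_t fps() const { return fps_; }

private:
    uint32_t window_;
    uint32_t startMs_;
    uint32_t frames_ = 0;
    uint32_t fps_ = 0;
};
```

Averaging over a window of 100 frames, as described above, keeps the printed value stable and keeps the cost of printing out of the measurement for most frames.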

What is the maximum theoretical FPS over 40 MHz SPI?

One frame needs 40 ms; thus 25 fps worst-case when fully repainting. Using two buffers lets you draw while sending, so practical refresh can exceed 45 fps for partial updates [Elektroda, katakrowa, post #21134297]

Which library call enables DMA in TFT_eSPI?

Call tft.initDMA() once after tft.init(); subsequent pushImageDMA() or pushPixelsDMA() use the DMA channel automatically [TFT_eSPI docs; Elektroda, Bodmer sketch, #21130469].

How do I integrate LVGL quickly?

1. Install the lvgl and TFT_eSPI libraries.
2. Create two draw buffers with lv_disp_draw_buf_init().
3. Register a display driver that calls tft.pushImageDMA() in the flush callback.

LVGL handles the rest [LVGL Docs, Display Porting].

What RAM budget should I reserve for LVGL double buffering?

Two 320×40 line buffers (recommended) consume about 25 kB each (320×40×2 = 25 600 bytes), so about 50 kB in total; leaving >200 kB for widgets and fonts keeps performance stable [LVGL Docs; Espressif Tech Ref].
