Hello, hope that you're doing well!
I have a large private gallery (NSFW, won't link), with around 1 million images in it spread over ~500 albums. It's essentially my web-portal to my gallery-dl folder.
It runs surprisingly smooth, but certain queries are extremely slow, and adding tags to every image basically make the site unusable.
Example of particularly slow queries:
"display a calendar by creation date"
SELECT DISTINCT id FROM piwigo_images INNER JOIN piwigo_image_category ON id = image_id WHERE 1 = 1 AND date_creation IS NOT NULL ORDER BY date_creation DESC, date_creation ASC, file ASC, id ASC (this query time : 52.066 s) (total SQL time : 72.627 s) (total time : 72.632 s) (num rows : 1087767 )
???? why distinct(id) ???? it's the primary key, it's already distinct
--> There's edge-cases where images can be in multiple galleries (categories), so it'd yield multiple entries for the same image. I personally don't use this function. And I feel like have a string/array field in the database containing one or multiple category IDs would be much faster to search through
SELECT * FROM posts WHERE string_to_array(category_string, ' ') @> '{category1}';
???? why sort 'date_creation' first DESC, then ASC ????? crazy
????? why have the piwigo_image_category in a separate table. Is it not possible to put it in the piwigo_images to avoid INNER JOIN
Sorting by multiple factors is wasteful
I suspect that counting the images by year (even though it's purely aesthetic) is taking a really long time
SELECT DISTINCT(YEAR(date_creation)) as period, COUNT(DISTINCT id) as nb_images FROM piwigo_images INNER JOIN piwigo_image_category ON id = image_id WHERE 1 = 1 AND date_creation IS NOT NULL GROUP BY period; (this query time : 20.071 s) (total SQL time : 20.074 s) (total time : 20.078 s) (num rows : 15 )
For information that's only being displayed as a number if you hoover over the year, this is a very wasteful query...
"searching by tag"
SELECT id FROM piwigo_images i INNER JOIN piwigo_image_category ic ON id=ic.image_id INNER JOIN piwigo_image_tag it ON id=it.image_id WHERE tag_id IN (6) GROUP BY id ORDER BY date_creation ASC, file ASC, id ASC (this query time : 0.557 s) (total SQL time : 13.882 s) (total time : 13.901 s) (num rows : 15467 )
really slow to do multiple inner join, and to ORDER BY multiple fields. Very wasteful
group by id...? as an alternative to distinct(id)??
"opening the admin panel"
SELECT COUNT(*) AS ext_counter, SUBSTRING_INDEX(path,".",-1) AS ext, SUM(filesize) AS filesize FROM `piwigo_images` GROUP BY ext ; (this query time : 4.254 s) (total SQL time : 9.753 s) (total time : 9.794 s) (num rows : 5 )
? Can't we just add an 'ext' database field? Seems wasteful to compute this again and again. I added an index on filesize and path because this query used to take 60+ seconds.
"looking at an image" (though not always slow)
UPDATE piwigo_images SET hit = hit+1, lastmodified = lastmodified WHERE id = 957357 ; (this query time : 2.773 s) (total SQL time : 3.373 s) (total time : 3.383 s) (affected rows : 1 )
"most visited"
SELECT DISTINCT(id) FROM piwigo_images INNER JOIN piwigo_image_category AS ic ON id = ic.image_id WHERE hit > 0 ORDER BY hit DESC, id DESC LIMIT 15 ; (this query time : 39.774 s) (total SQL time : 39.778 s) (total time : 39.784 s) (num rows : 15 )
?? why distinct(id), it's the primary key...
I don't actually understand why it's doing the inner join here. The "most visited" page isn't specific to albums, it's the overall toplist. And I assume that it's limited to 15 because doing it this way is so slow. Are the categories also being used for access control? Would be much faster to integrate this into the piwigo_images table. There is no reason why listing the top 15 should take half a minute. Sorting 'hit' with an index from the piwigo_images table would take milliseconds
"searching regular"
SELECT id from piwigo_images i WHERE (CONVERT(file, CHAR) LIKE '%search%' OR MATCH(name, comment) AGAINST( 'search searches' IN BOOLEAN MODE)) (this query time : 3.219 s) (total SQL time : 3.241 s) (total time : 3.247 s) (num rows : 0 )
Crazy query. Just put a FULLTEXT index on 'file' and 'name'
Piwigo is running bare-metal on a dual-xeon system (8+8 cores) with 196gb ram. The database is on ssd.
I understand that piwigo wasn't intended for such large galleries (or maybe it was? ;)). But could you advice me if there's anything that I can change to make this faster? I'm almost at the point where I'll just dig into the php myself to change these queries, but I don't want to fork piwigo because the development seems to be progressing quite nicely.
Thanks for your time. Hope to hear from you
Piwigo 13.8.0 Check for upgrade Installed on 9 July 2022, 1 year 2 months 3 weeks 6 days ago Operating system: Linux PHP: 7.4.30 (Show info) [2023-10-05 21:03:17] MySQL: 5.5.5-10.6.12-MariaDB-0ubuntu0.22.04.1 [2023-10-05 21:03:17] Graphics Library: External ImageMagick 6.9.11-60 Cache size 65983.40 MB calculated right now Refresh Admin Tools AStat.2 Automatic Size BatchCustomDerivatives DefaultCreationDateToToday Grum Plugins Classes.3 GThumb+ LocalFiles Editor Properties Mass Update SortOrders VideoJS
Last edited by redtails (2023-10-07 23:23:12)
Offline