[IMP] api: improve storing performance of the api cache
This revision moves the `cache_key` to the first-level dict of the cache, instead of the last one. Doing so reduces the number of times a reference to the cache key is stored in the dict. For instance, for 100,000 records, 20 fields and 2 environments (e.g. with and without sudo), there were formerly 100,000 * 20 * 2 occurrences of cache key references; now, there are only 2.

Storing a reference to an object consumes memory. Therefore, by reducing the number of object references in the cache, we reduce the memory consumed by the cache. We also reduce the time needed to access a value in the cache, as the cache is smaller. Both time and memory consumption are therefore improved, while keeping the advantages of revision d7190a3f, which shared the cache of fields that do not depend on the context, but only on the cursor and user id.

This revision relies on the fact that there are fewer distinct references to the cache key than references to fields/records: it is far more likely to have 100,000 different records stored in the cache than 100,000 different environments.

Here is the Python proof of concept that was used to conclude that setting the `cache_key` in the first-level dict of the cache is more efficient.
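The reference-count saving itself can be illustrated directly with `sys.getrefcount` (a minimal sketch, not part of the original benchmark; the sizes are illustrative and the dicts stand in for the real cache structure):

```python
import sys

# Stand-ins for the cursor/uid pair, the fields and the record ids.
cr, uid = object(), 1
cache_key = (cr, uid)
fields = [object() for _ in range(20)]
n_records = 1000

base = sys.getrefcount(cache_key)

# Old layout: cache[field][record_id][cache_key]
# -> every innermost dict holds its own reference to cache_key.
old = {f: {i: {cache_key: 5.0} for i in range(n_records)} for f in fields}
refs_old = sys.getrefcount(cache_key) - base

del old

# New layout: cache[cache_key][field][record_id]
# -> a single reference to cache_key, at the top level.
new = {cache_key: {f: {i: 5.0 for i in range(n_records)} for f in fields}}
refs_new = sys.getrefcount(cache_key) - base

print(refs_old, refs_new)  # 20000 vs 1
```

With 20 fields and 1,000 records, the old layout stores 20,000 references to the key, the new one exactly one; scaling to 100,000 records and 2 environments gives the 100,000 * 20 * 2 vs. 2 figure above.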
```Python
import os
import psutil
import time
from collections import defaultdict

cr = object()
uid = 1
fields = [object() for i in range(20)]
number_items = 500000

p = psutil.Process(os.getpid())
m = p.memory_info().rss
s = time.time()

cache_key = (cr, uid)
cache = defaultdict(lambda: defaultdict(dict))
for field in fields:
    for i in range(number_items):
        cache[field][i][cache_key] = 5.0
        # cache[cache_key][field][i] = 5.0

print('Memory: %s' % (p.memory_info().rss - m,))
print('Time: %s' % (time.time() - s,))
```

- Using `cache[field][i][cache_key]`:
  - Time: 3.17s
  - Memory: 3138MB
- Using `cache[cache_key][field][i]`:
  - Time: 1.43s
  - Memory: 756MB

Even worse, when the cache key tuple is instantiated inside the loop with the former cache structure (e.g. `cache[field][i][(cr, uid)]`), the time goes from 3.17s to 25.63s and the memory from 3138MB to 3773MB.

Here is the same proof of concept, but using the Odoo API and Cache:

```Python
import os
import psutil
import time
from odoo.api import Cache

model = env['res.users']
records = [model.new() for i in range(100000)]

p = psutil.Process(os.getpid())
m = p.memory_info().rss
s = time.time()

cache = Cache()
char_fields = [field for field in model._fields.values() if field.type == 'char']
for field in char_fields:
    for record in records:
        cache.set(record, field, 'test')

print('Memory: %s' % (p.memory_info().rss - m,))
print('Time: %s' % (time.time() - s,))
```

- Before (`cache[field][record_id][cache_key]`, with the cache key tuple instantiated in the loop):
  - Time: 4.12s
  - Memory: 810MB
- After (`cache[cache_key][field][record_id]`, with the cache key tuple stored in the env and re-used):
  - Time: 1.63s
  - Memory: 125MB

This can be run in an Odoo shell, for instance by storing it in `/tmp/test.py` and then piping it to the Odoo shell: `cat /tmp/test.py | ./odoo-bin shell -d 12.0`

closes odoo/odoo#29676

closes odoo/odoo#30554