Speedups: remove dependency on c++ (#2796)

* Speedups: remove dependency on c++

* Speedups: intset: handle malloc failing

* Speedups: intset: fix corner case for int64 on 32bit systems

original idea was to only use bucket->val if int<pointer,
but we always have a union now anyway

* Speedups: add size comment to player_set bucket configuration

* test: more tests for LocationStore.find_item

* test: require _speedups in CI

This kind of tests that the build succeeds.

* test: even more tests for LocationStore.find_item

* Speedups: intset uniform comment style

* Speedups: intset: avoid memory leak when realloc fails

* Speedups: intset: make `gcc -pedantic -std=c99 -fanalyzer` without warnings

Unnamed unions are not in C99, this got fixed.
The overhead of setting count=0 is minimal or optimized-out and silences -fanalizer (see comment).

* Speedups: don't leak memory in case of exception

* Speedups: intset: validate alloc and free

This won't happen in our cython, but it's still a good addition.

* CI: add test framework for C/C++ code

* CI: ctest: fix cwd

* Speedups: intset: ignore msvc warning

* Tests: intset: revert attempt at no-asan

We solve this with env vars in ctest now, and this fails for msvc.

* Test: cpp: docs: fix typo

* Test: cpp: docs: fix another typo

* Test: intset: proper bucket count for Negative test

INTxx_MIN % 1 would not produce a negative number, so the test was flawed.
This commit is contained in:
black-sliver
2024-06-12 18:54:59 +02:00
committed by GitHub
parent 2daccded36
commit acf85eb9ab
10 changed files with 450 additions and 19 deletions

View File

@@ -1,5 +1,6 @@
#cython: language_level=3
#distutils: language = c++
#distutils: language = c
#distutils: depends = intset.h
"""
Provides faster implementation of some core parts.
@@ -13,7 +14,6 @@ from cpython cimport PyObject
from typing import Any, Dict, Iterable, Iterator, Generator, Sequence, Tuple, TypeVar, Union, Set, List, TYPE_CHECKING
from cymem.cymem cimport Pool
from libc.stdint cimport int64_t, uint32_t
from libcpp.set cimport set as std_set
from collections import defaultdict
cdef extern from *:
@@ -31,6 +31,27 @@ ctypedef int64_t ap_id_t
cdef ap_player_t MAX_PLAYER_ID = 1000000 # limit the size of indexing array
cdef size_t INVALID_SIZE = <size_t>(-1) # this is all 0xff... adding 1 results in 0, but it's not negative
# configure INTSET for player
cdef extern from *:
"""
#define INTSET_NAME ap_player_set
#define INTSET_TYPE uint32_t // has to match ap_player_t
"""
# create INTSET for player
cdef extern from "intset.h":
"""
#undef INTSET_NAME
#undef INTSET_TYPE
"""
ctypedef struct ap_player_set:
pass
ap_player_set* ap_player_set_new(size_t bucket_count) nogil
void ap_player_set_free(ap_player_set* set) nogil
bint ap_player_set_add(ap_player_set* set, ap_player_t val) nogil
bint ap_player_set_contains(ap_player_set* set, ap_player_t val) nogil
cdef struct LocationEntry:
# layout is so that
@@ -185,7 +206,7 @@ cdef class LocationStore:
def find_item(self, slots: Set[int], seeked_item_id: int) -> Generator[Tuple[int, int, int, int, int], None, None]:
cdef ap_id_t item = seeked_item_id
cdef ap_player_t receiver
cdef std_set[ap_player_t] receivers
cdef ap_player_set* receivers
cdef size_t slot_count = len(slots)
if slot_count == 1:
# specialized implementation for single slot
@@ -197,13 +218,20 @@ cdef class LocationStore:
yield entry.sender, entry.location, entry.item, entry.receiver, entry.flags
elif slot_count:
# generic implementation with lookup in set
for receiver in slots:
receivers.insert(receiver)
with nogil:
for entry in self.entries[:self.entry_count]:
if entry.item == item and receivers.count(entry.receiver):
with gil:
yield entry.sender, entry.location, entry.item, entry.receiver, entry.flags
receivers = ap_player_set_new(min(1023, slot_count)) # limit top level struct to 16KB
if not receivers:
raise MemoryError()
try:
for receiver in slots:
if not ap_player_set_add(receivers, receiver):
raise MemoryError()
with nogil:
for entry in self.entries[:self.entry_count]:
if entry.item == item and ap_player_set_contains(receivers, entry.receiver):
with gil:
yield entry.sender, entry.location, entry.item, entry.receiver, entry.flags
finally:
ap_player_set_free(receivers)
def get_for_player(self, slot: int) -> Dict[int, Set[int]]:
cdef ap_player_t receiver = slot