Make it as simple as possible, not simpler

The real "simple" solution is going back to a single allocation per element.
A single allocation ensures better data locality,
which in turn means fewer cache misses and better performance.
But we need a fundamentally different approach...

The approach taken in the kernel reverses the structure:
instead of including the payload in the list structure,
the list structure is included in the payload itself.

#include

struct list_head {
	struct list_head *next, *prev;
};

static inline void list_add(struct list_head *new, struct list_head *head);
static inline void list_add_tail(struct list_head *new, struct list_head *head);
static inline void list_del(struct list_head *entry);
/* .... */
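
To make the "list structure inside the payload" idea concrete, here is a minimal userspace sketch, not the kernel's actual code. It assumes a hypothetical payload type `struct item` and a hypothetical helper `sum_items`. The `container_of` macro is a simplified version without the kernel's type checking, and `list_del` simply clears the pointers instead of poisoning them as the kernel does.

```c
#include <stddef.h>  /* offsetof, NULL */

struct list_head {
    struct list_head *next, *prev;
};

/* Simplified container_of: step back from the embedded member
 * to the start of the structure that contains it. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* An empty list is a head that points to itself in both directions. */
static inline void INIT_LIST_HEAD(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert 'new' just before 'head', i.e. at the tail of the list. */
static inline void list_add_tail(struct list_head *new, struct list_head *head)
{
    new->prev = head->prev;
    new->next = head;
    head->prev->next = new;
    head->prev = new;
}

/* Unlink 'entry' from whatever list it is on. */
static inline void list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = NULL;
    entry->prev = NULL;
}

/* Hypothetical payload: the list node lives inside the payload,
 * so payload and links come from one allocation. */
struct item {
    int value;
    struct list_head node;
};

/* Hypothetical helper: walk the list and add up every item's value. */
static inline int sum_items(struct list_head *head)
{
    int total = 0;
    struct list_head *pos;

    for (pos = head->next; pos != head; pos = pos->next)
        total += container_of(pos, struct item, node)->value;
    return total;
}
```

The list code never needs to know the payload type. Only the caller does, when it converts a `struct list_head *` back into its own structure with `container_of`. The same `list_head` can therefore be embedded in any structure without a separate node allocation.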