google.tcmalloc 要点

作者: rockeet 发表日期: 2010年02月22日分类: C++ 评论: 0 条阅读次数: 2,658 次

线程缓存(thread cache)

加锁(locking)

加锁的开销是比较大的（cpu要同步 cache），一般要几十个时钟周期，特别是 cpu 较多的时候，锁冲突概率大大增加。继续阅读 →

google.sparsetable 实现细节

作者: rockeet 发表日期: 2010年02月08日分类: C++, HashTable 评论: 0 条阅读次数: 2,974 次

与现有的一些“标准”实现不同，sparsehashtable 使用二次探测法，而不是链接，来解决hash冲突。sparsetable 就更奇特了，它是一个两级结构，第一级使用直接索引法，而第二级使用的是顺序查找。
继续阅读 →

gcc 4.5

作者: rockeet 发表日期: 2010年02月07日分类: C++ 评论: 0 条阅读次数: 2,482 次

只拣一些有用的：
The -fshow-column option is now on by default. This means error messages now have a column associated with them. 继续阅读 →

my PipelineProcessor

作者: rockeet 发表日期: 2010年02月05日分类: C++, 操作系统评论: 0 条阅读次数: 2,947 次

刚看到，intel tbb::pipeline 实现的功能，和我以前实现的一个pipeline ： febird::thread::PipelineProcessor
，介绍：

1. 多线程的 pipeline 设计模式

2. 多线程 Pipeline 的改进

几乎就是同一个东西：多线程流水线执行，按次序，stage划分……

可惜啊，它只在我的几个程序中用过，现在有了tbb，它还没来得及壮大就得夭折了。

聊以自慰：

1. 英雄所见略同

2. 这个轮子很有价值

3. 发明轮子之前，先多看看

4. 多看看，碰到问题就免得发明轮子

简单的代码生成器创建领域语言

作者: rockeet 发表日期: 2010年02月05日分类: C++, shell, 操作系统, 杂谈评论: 0 条阅读次数: 3,511 次

有一类问题，代码模板相同，但有少部分地方不同，一般可以写一个复杂的程序，使用不同的选项，完成不同的任务。或者，把公共的部分抽象成一个代码库，然后在不同程序中引用。但是，如果公共的部分很少，并且比较“专用”，或者因为其它原因，比较难以部署。怎么办？

实际上，有另一种完全不同的编程模式来实现：代码生成器。unix世界中最知名的代码生成器莫过于lex和yacc了。但是，不比每个代码生成器都那么复杂，比如这个代码生成器就非常简单，它只是简单地转换行记录：

#! /bin/sh
field_seperator="||"
output=b
while getopts :F:vo: arg
do
case $arg in
        F ) field_seperator=$OPTARG;;
        v ) ;;
        o ) output=$OPTARG;;
        : ) echo "$0: missing arg for -$OPTARG " >&2
            exit;;
        /?) echo "Invalid option -$OPTARG ignored." >&2
            exit;;
esac
done
if [ $OPTIND -gt $# ]
then
#       echo OPTIND=$OPTIND argc=$# >&2
        echo "no program" >&2
        exit
fi
program=${!#}
echo field_seperator=$field_seperator
cat > a.cpp <<+TemplateCFile
#include <vector>
#include <string.h>
#include <stdio.h>
const char field_seperator[]="||";
void split_row(char* line, std::vector<char*>& F, const char* fs)
{
        char* col = line;
        F.resize(0);
        size_t fslen = strlen(fs);
        if (fslen == 1) {
                for (;;) {
                        F.push_back(col);
                        col = strchr(col, fs[0]);
                        if (col) {
                                col[0] = '/0';
                                col += 1;
                        } else
                                break;
                }
        }
        else {
                for (;;) {
                        F.push_back(col);
                        col = strstr(col, fs);
                        if (col) {
                                col[0] = '/0';
                                col += fslen;
                        } else
                                break;
                }
        }
}
int main(int argc, char* argv[])
{
        size_t  len1 = 0;
        ssize_t len2;
        char*   line = NULL;
        std::vector<char*> F;
        while ((len2 = getline(&line, &len1, stdin)) != -1)
        {
                split_row(line, F, field_seperator);
                int NF = F.size();
//--- begin user program
+TemplateCFile
echo $program >> a.cpp
cat >> a.cpp <<+TemplateCFile
//--- end user program
; // avoid user program missing ;
                printf("/n");
        }
        if (line) free(line);
        if (ferror(stdin)) {
                perror("ferror(stdin)");
                return 1;
        }
        return 0;
}
+TemplateCFile
sed -i 's//(field_seperator/[/]=/).*";//1"'$field_seperator'";/g' a.cpp
gcc -O2 a.cpp -lstdc++ -o $output
exit $?

#! /bin/sh

field_seperator="||"

output=b

while getopts :F:vo: arg

case $arg in

F ) field_seperator=$OPTARG;;

v ) ;;

o ) output=$OPTARG;;

: ) echo "$0: missing arg for -$OPTARG " >&2

exit;;

/?) echo "Invalid option -$OPTARG ignored." >&2

exit;;

esac

done

if [ $OPTIND -gt $# ]

then

# echo OPTIND=$OPTIND argc=$# >&2

echo "no program" >&2

exit

program=${!#}

echo field_seperator=$field_seperator

cat > a.cpp <<+TemplateCFile

#include <vector>

#include <string.h>

#include <stdio.h>

const char field_seperator[]="||";

void split_row(char* line, std::vector<char*>& F, const char* fs)

{

char* col = line;

F.resize(0);

size_t fslen = strlen(fs);

if (fslen == 1) {

for (;;) {

F.push_back(col);

col = strchr(col, fs[0]);

if (col) {

col[0] = '/0';

col += 1;

} else

break;

}

else {

for (;;) {

F.push_back(col);

col = strstr(col, fs);

if (col) {

col[0] = '/0';

col += fslen;

} else

break;

}

int main(int argc, char* argv[])

{

size_t len1 = 0;

ssize_t len2;

char* line = NULL;

std::vector<char*> F;

while ((len2 = getline(&line, &len1, stdin)) != -1)

{

split_row(line, F, field_seperator);

int NF = F.size();

//--- begin user program

+TemplateCFile

echo $program >> a.cpp

cat >> a.cpp <<+TemplateCFile

//--- end user program

; // avoid user program missing ;

printf("/n");

}

if (line) free(line);

if (ferror(stdin)) {

perror("ferror(stdin)");

return 1;

}

return 0;

}

+TemplateCFile

sed -i 's//(field_seperator/[/]=/).*";//1"'$field_seperator'";/g' a.cpp

gcc -O2 a.cpp -lstdc++ -o $output

exit $?

可以象awk一样写程序：

# 相当于 awk -F,  '{printf("%s/t%s/n", $1, $5)}'  
# 使用 ',' 做列分隔符，输出第 1 和第 5 个字段，生成二进制可执行程序 myprog  
./gencode.sh -F , -o myprog 'printf("%s/t%s/n", F[0], F[4])'

# 相当于 awk -F, '{printf("%s/t%s/n", $1, $5)}'

# 使用 ',' 做列分隔符，输出第 1 和第 5 个字段，生成二进制可执行程序 myprog

./gencode.sh -F , -o myprog 'printf("%s/t%s/n", F[0], F[4])'

# 相当于 awk -F,  '{printf(&quot;%s/t%s/n&quot;, $1, $5)}'
# 使用 ',' 做列分隔符，输出第 1 和第 5 个字段，生成二进制可执行程序 myprog
./gencode.sh -F , -o myprog 'printf(&quot;%s/t%s/n&quot;, F[0], F[4])'

# 相当于 awk -F, '{printf("%s/t%s/n", $1, $5)}'

# 使用 ',' 做列分隔符，输出第 1 和第 5 个字段，生成二进制可执行程序 myprog

./gencode.sh -F , -o myprog 'printf("%s/t%s/n", F[0], F[4])'

我当初写这个生成器的原因是发现非常简单的 awk 程序也比 C 慢 40 倍，以为这是本质上的性能差距，后来才发现不是。

对这个简单的程序，使用awk更方便更安全，也不比C慢，但是一旦碰到其它类似问题而 awk 解决不了，这种模式就可以派上用场了。

字符串基数排序

作者: rockeet 发表日期: 2010年02月05日分类: C++, 算法评论: 2 条阅读次数: 3,163 次

　　对字符串使用基数排序，以前，我一直觉得：因为字符串的长度不一，无法使用基数排序。前两天因为有需要，忽然想通了！即便长短不一，也可以使用链式基数排序！

　　首先，将字符串长度当作最低有效位，因为基数排序是从最低有效位开始排的，就先用分配-收集算法对长度做一趟。对字符串中的具体某一位字符进行排序相比，算法是一样的，只是写法稍有不同。要将排序结果的lenRadix指针保存起来，后面要用。

　　接下来，从lenRadix中取字符串最长的那个sublist，对该sublist排序，然后将这一趟的结果first保存起来，连接到长度次短的那个sublist之后，然后对这两个链接起来的列表进行一趟分配-收集。如此，直到最高有效位。

　　所有的工作就做完了，根据此算法，对所有待排序字符串中的每个字符，均需要一次且仅一次访问！另外，还需要
O((radix+1)*max_str_len)的时间复杂度用于扫描链接表，(radix+1)是因为还有一个strlen链接表。所以，总的时间复
杂度是O(n+(radix+1)*max_str_len)，其中N是所有字符串的总字符数。

　　在排序过程中，可以插入一个codetab，来实现不同的排序准则（例如忽略大小写），如果提供了wchar_t codetab，就按 wchar_t 排序，如果wchar_t codetab 非 NULL，就按转换了的 wchar_t 排序。

　　如果对unicode排序，最好指定一个codetab，把radix变小，不然的话，时间复杂度就太大了！

　　经过测试，在大约20000~30000个字符串的情况下，比std::sort快5~7倍。数据规模再增大，至5,000,000个字符串时，比std::sort大概快1.8~2.5倍！

代码：

http://code.google.com/p/febird/source/browse/trunk/febird/src/febird/radix_sort.cpp

http://code.google.com/p/febird/source/browse/trunk/febird/src/febird/radix_sort.h

测试(bench mark)代码：

http://code.google.com/p/febird/source/browse/trunk/febird/codelite/RadixSort/main.cpp

rdtsc 备忘

作者: csdn-whinah 发表日期: 2010年01月09日分类: C++ 评论: 0 条阅读次数: 2,724 次

继续阅读 →

内嵌变长数据结构范例——trbstrmap

作者: rockeet 发表日期: 2009年11月03日分类: C++ 评论: 0 条阅读次数: 2,606 次

以前的这篇文章介绍了嵌入的变长数据结构（embeded）

本文介绍一个使用这种思想实现的通用strmap容器，相当于：

std::map<std::string, Mapped>。

实现上使用了我以前写的线索红黑树——相比标准map的实现，节省了一半的结点存储开销，而平均查找时间只付出很小的额外开销，并且没有代码膨胀问题。

使用

大多数情况下

trbstrmap<MappedType[, Compare]> // Compare是可选的，allocator 总在类范围，可以修改:

trbstrmap<MappedType[, Compare]>::s_vtab.alloc = ….;

为了节省空间，trbstrmap使用一个字节存储strkey_length，因此，strlen(strkey)不可超过255，允许长度为0的key。最大长度255基本上可以应对绝大多数情况。

把key作为变长字段，有一个原因很重要：key一旦插入，就不可改变！因此，整个结点的memblock也就不需要重新分配，可以修改的只有mapped_data。

作为一个比较：如果把key作为一个固定长度的结构，而mapped_data作为变长内嵌数据，当需要修改mapped_data时，就需要重新分配该结点，还需要修改该结点的父节点指向该结点的指针，并且，会破坏“node容器的iterator只有在删除该node以后才失效”的语义！

一旦用在多线程环境中，问题就会变得很复杂，因为修改了父节点。就不能仅lock需要修改数据的那个结点，而是需要lock整个tree，使用数据库的术语，就是一下子从记录锁变成了表锁，本来按key查找时，因为key没改变，就不需要加锁，而这样搞，查找时就需要对整个表加锁，严重影响并发性！

注意事项：

1. trbstrmap::iterator::value_type 和trbstrmap::value_type都是 mapped_type/Value，*iter也是mapped_type；而不是std::map的pair<Key,Value>。

2. trbstrmap::key(iter)获得iter对应的key

3. trbstrmap::klen(iter)获得key_length，当然，strlen(trbstrmap::key(iter))也可以获得key_length，但需要一些时间开销；另外，当key_content中存在/0时，klen就是不可缺少的了（后面将详细介绍这种情况）

4. 再次强调，key_cstr_length最大长度不可超过255，但这个长度不包括末尾的 /0

示例代码：

typedef febird::trbstrmap<int, &trb_compare_tev_inline_cstr_nocase> smap_t;

smap_t smap, smap2;

smap[“1”] = 0;

smap[“2”] = 1;

smap[“3”] = 1;

smap[“4”] = 2;

smap[“1”]++;

smap[“4”] += 2;

smap.insert(“5”, 5);

assert(!smap.insert(“5”, 6).second);

// str长度不可超过255

smap[“aabbccddeeffgghhiijjkk=====================================”] = 10;

for (typename SMT::iterator i = smap.begin(); i != smap.end(); ++i)

{

printf(“smap[%s]=%d, klen=%d/n”, smap_t::key(i), *i, smap_t::klen(i));

}

在简单的情况下，使用trbstrmap,比std::map可以节省50%~70%的内存。使用的方便程度，基本上没差别。

数据结构

Mapped Data 是模板参数，可以是任意struct/class，当然也可以内嵌指针，也可以是复杂对象，没有任何限制。

trbstrmap Node Structure

复杂情形：任意字符串，字符串中可以包含/0

trbstrmap允许加入内容中存在/0的字符串，这种应用比较罕见，但是仍然允许，只是必须遵循以下注意事项：

1. 在这种情况下，需要传入key_content_length参数：
trbstrmap.probe(key,key_content_length)
trbstrmap.insert(key,key_content_length, val)
key_content_length是整个key_content长度，包括末尾的/0——如果有的话

2. 或者，传入std::string ：
trbstrmap[std::string(…)] = val;
equal to: trbstrmap.probe(s.c_str(), s.size()+1) = val;

3. trbstrmap.insert(key, klen, val)

4. 这种情况下，应用一般需要传入一个自定义的Compare

5. trbstrmap::klen(iter)在任何情况下，返回的都是key_content_length – 1，
而不管末尾有没有/0

6. 自定义函数需要获取key_content_length时，应该如下：

int

trb_compare_custom ( const struct trb_vtab* TRB_restrict vtab,

const struct trb_tree* TRB_restrict tree,

const void* TRB_restrict x,

const void* TRB_restrict y)

{

size_t lx = ((unsigned char*)x)[-1] + 1;

size_t ly = ((unsigned char*)y)[-1] + 1;

int ret = memcmp(x, y, lx<ly?lx:ly); // 仅示例，不一定非用memcmp

return 0 == ret ? lx – ly : ret;

}

7. 当给trbstrmap::find/lower_bound/equal_range多传入一个key_content_length时，这些函数会将key和key_content_length打包（这些都在库内部完成），应用须知这会有稍微多一些的开销。

必须这么做，因为compare函数需要找到key_content_length，除此之外，别无他法。

8. 貌似很复杂，但始终坚持剃刀原理：在绝大多数情况下，也就是字符串是cstr，末尾包含/0时，函数的行为符合传统惯例并且高效；在罕见情况下也可用，只是效率有一点下降，使用难度也大一些。

9. 当情况复杂时，记住：只要你明确传入了长度，这个长度就是你真实数据的长度

代码：http://code.google.com/p/febird/source/browse/trunk/febird/src/febird/trb_cxx.h

http://code.google.com/p/febird/source/browse/trunk/febird/src/febird/trb_cxx.cpp

测试代码：

http://code.google.com/p/febird/source/browse/trunk/febird/vcproj/test_trb/test_trb_cxx.cpp

memory pool 的高效实现（代码）

作者: rockeet 发表日期: 2009年10月23日分类: C++ 评论: 2 条阅读次数: 2,855 次

mpool.h

#ifndef __febird_c_mpool_h__
#define __febird_c_mpool_h__

#ifdef __cplusplus
extern "C" {
#endif

//------------------------------------------------------------------------------------------
typedef void* (*chunk_alloc_ft)(struct chunk_allocator* self, size_t size);
typedef void* (*chunk_realloc_ft)(struct chunk_allocator* self, void* block, size_t olc_size, size_t new_size);
typedef void  (*chunk_free_ft)(struct chunk_allocator* self, void* block, size_t size);

struct chunk_allocator
{
	void* (*chunk_alloc)(struct chunk_allocator* self, size_t size);
	void* (*chunk_realloc)(struct chunk_allocator* self, void* block, size_t olc_size, size_t new_size);
	void  (*chunk_free)(struct chunk_allocator* self, void* block, size_t size);
};

struct fixed_mpool
{
	struct chunk_allocator* ca;
	struct mpool_chunk* pChunks;
	struct mpool_cell*  head;
	size_t iNextChunk;
	size_t nChunks;
	size_t cell_size;
	size_t chunk_size;
	size_t used_cells;
//	size_t free_cells;
};

// 是否允许 mpool 分配超出 max_cell_size 的内存块
//#define FEBIRD_MPOOL_ALLOW_BIG_BLOCK
#ifdef FEBIRD_MPOOL_ALLOW_BIG_BLOCK
struct big_block_header
{
	struct big_block_header *next, *prev;
};
#endif

struct mpool
{
//------------------------------------------------------------------------------------------
/// export a chunk_allocator interface
	void* (*chunk_alloc)(struct chunk_allocator* self, size_t size);
	void* (*chunk_realloc)(struct chunk_allocator* self, void* block, size_t olc_size, size_t new_size);
	void  (*chunk_free)(struct chunk_allocator* self, void* block, size_t size);
//------------------------------------------------------------------------------------------

	/// chunk_allocator for this mpool self
	struct chunk_allocator* ca;

	struct fixed_mpool* fixed;
	size_t max_cell_size;
	size_t chunk_size;
#ifdef FEBIRD_MPOOL_ALLOW_BIG_BLOCK
	size_t big_blocks; // size > max_cell_size, use malloc, this is rare case
	struct big_block_header big_list;
#endif
};

void fixed_mpool_init(struct fixed_mpool* mpf);
void fixed_mpool_destroy(struct fixed_mpool* mpf);

void* fixed_mpool_alloc(struct fixed_mpool* mpf);
void fixed_mpool_free(struct fixed_mpool* mpf, void* ptr);


void mpool_init(struct mpool* mp);
void mpool_destroy(struct mpool* mp);

void* mpool_alloc(struct mpool* mp, size_t size);
void mpool_free(struct mpool* mp, void* ptr, size_t size);

size_t mpool_used_cells(const struct mpool* mp);
size_t mpool_used_bytes(const struct mpool* mp);

#ifdef __cplusplus
}
#endif

#endif // __febird_c_mpool_h__

#ifndef __febird_c_mpool_h__

#define __febird_c_mpool_h__

#ifdef __cplusplus

extern "C" {

#endif

//------------------------------------------------------------------------------------------

typedef void* (*chunk_alloc_ft)(struct chunk_allocator* self, size_t size);

typedef void* (*chunk_realloc_ft)(struct chunk_allocator* self, void* block, size_t olc_size, size_t new_size);

typedef void (*chunk_free_ft)(struct chunk_allocator* self, void* block, size_t size);

struct chunk_allocator

{

void* (*chunk_alloc)(struct chunk_allocator* self, size_t size);

void* (*chunk_realloc)(struct chunk_allocator* self, void* block, size_t olc_size, size_t new_size);

void (*chunk_free)(struct chunk_allocator* self, void* block, size_t size);

};

struct fixed_mpool

{

struct chunk_allocator* ca;

struct mpool_chunk* pChunks;

struct mpool_cell* head;

size_t iNextChunk;

size_t nChunks;

size_t cell_size;

size_t chunk_size;

size_t used_cells;

// size_t free_cells;

};

// 是否允许 mpool 分配超出 max_cell_size 的内存块

//#define FEBIRD_MPOOL_ALLOW_BIG_BLOCK

#ifdef FEBIRD_MPOOL_ALLOW_BIG_BLOCK

struct big_block_header

{

struct big_block_header *next, *prev;

};

#endif

struct mpool

{

//------------------------------------------------------------------------------------------

/// export a chunk_allocator interface

void* (*chunk_alloc)(struct chunk_allocator* self, size_t size);

void* (*chunk_realloc)(struct chunk_allocator* self, void* block, size_t olc_size, size_t new_size);

void (*chunk_free)(struct chunk_allocator* self, void* block, size_t size);

//------------------------------------------------------------------------------------------

/// chunk_allocator for this mpool self

struct chunk_allocator* ca;

struct fixed_mpool* fixed;

size_t max_cell_size;

size_t chunk_size;

#ifdef FEBIRD_MPOOL_ALLOW_BIG_BLOCK

size_t big_blocks; // size > max_cell_size, use malloc, this is rare case

struct big_block_header big_list;

#endif

};

void fixed_mpool_init(struct fixed_mpool* mpf);

void fixed_mpool_destroy(struct fixed_mpool* mpf);

void* fixed_mpool_alloc(struct fixed_mpool* mpf);

void fixed_mpool_free(struct fixed_mpool* mpf, void* ptr);

void mpool_init(struct mpool* mp);

void mpool_destroy(struct mpool* mp);

void* mpool_alloc(struct mpool* mp, size_t size);

void mpool_free(struct mpool* mp, void* ptr, size_t size);

size_t mpool_used_cells(const struct mpool* mp);

size_t mpool_used_bytes(const struct mpool* mp);

#ifdef __cplusplus

}

#endif

#endif // __febird_c_mpool_h__

mpool.c

#include "stdafx.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>

#include "mpool.h"

/* must be power of 2 and greater than sizeof(void*) */
#define MPOOL_MIN_CELL 8

#define MPOOL_MIN_CHUNK 256

struct mpool_cell
{
	struct mpool_cell* next;
};

struct mpool_chunk
{
	struct mpool_cell* cell; // cell array
	size_t size; // size in bytes;
};

/**********************************************************************************/
/**
 * chunk_allocator use malloc/free
 */
static void*
chunk_malloc_alloc(struct chunk_allocator* , size_t size)
{
	return malloc(size);
}

static void 
chunk_malloc_free(struct chunk_allocator* , void* block, size_t size)
{
	return free(block);
}

static void*
chunk_malloc_realloc(struct chunk_allocator* , void* block, size_t old_size, size_t new_size)
{
	return realloc(block, new_size);
}
/**********************************************************************************/

void* 
default_chunk_realloc(struct chunk_allocator* ca, void* ptr, size_t old_size, size_t new_size)
{
	void* q = ca->chunk_alloc(ca, new_size);
	assert(old_size > 0);
	assert(new_size > 0);
	if (NULL == q) return NULL;
	memcpy(q, ptr, old_size < new_size ? old_size : new_size);
	ca->chunk_free(ca, ptr, old_size);
	return q;
}

static void 
mpool_chunk_init(struct mpool_cell* cell, size_t cell_count, size_t cell_size)
{
	size_t i;
	struct mpool_cell* p = cell;
	assert(cell_size % MPOOL_MIN_CELL == 0);
	for (i = 0; i < cell_count-1; ++i)
		p = p->next = (struct mpool_cell*)((char*)p + cell_size);
	p->next = NULL;
}

/**********************************************************************************/
static struct chunk_allocator fal = {
	&chunk_malloc_alloc,
	&chunk_malloc_realloc,
	&chunk_malloc_free
};

/**********************************************************************************/
/**
 * require initialized fields:
 *	 cell_size
 *	 chunk_size
 *	 ca [0 OR initialized]
 */
void fixed_mpool_init(struct fixed_mpool* mpf)
{
	struct chunk_allocator* al;
	if (NULL == mpf->ca)
		al = mpf->ca = &fal;
	else {
		al = mpf->ca;
		assert(NULL != al->chunk_alloc);
		assert(NULL != al->chunk_free);
		if (NULL == al->chunk_realloc)
			al->chunk_realloc = &default_chunk_realloc;
	}
	assert(mpf->chunk_size > 0);
	assert(mpf->cell_size > 0);
	assert(mpf->cell_size < mpf->chunk_size);

	mpf->cell_size = (mpf->cell_size + MPOOL_MIN_CELL - 1) / MPOOL_MIN_CELL * MPOOL_MIN_CELL;
	mpf->chunk_size = (mpf->chunk_size + MPOOL_MIN_CHUNK - 1) / MPOOL_MIN_CHUNK * MPOOL_MIN_CHUNK;

	mpf->head = NULL;

	if (mpf->nChunks < MPOOL_MIN_CHUNK/sizeof(struct mpool_chunk))
		mpf->nChunks = MPOOL_MIN_CHUNK/sizeof(struct mpool_chunk);

	mpf->iNextChunk = 0;
	mpf->pChunks = (struct mpool_chunk*)
		al->chunk_alloc(al, sizeof(struct mpool_chunk) * mpf->nChunks);
	if (NULL == mpf->pChunks) {
		fprintf(stderr
			, "fatal: febird.fixed_mpool_init failed, chunk[size=%zd, count=%zd]/n"
			, mpf->chunk_size, mpf->nChunks);
		abort();
	}
	mpf->used_cells = 0;
}

void fixed_mpool_destroy(struct fixed_mpool* mpf)
{
	struct chunk_allocator* ca = mpf->ca;
	long i;
	for (i = mpf->iNextChunk - 1; i >= 0; --i)
		ca->chunk_free(ca, mpf->pChunks[i].cell, mpf->pChunks[i].size);

	ca->chunk_free(ca, mpf->pChunks, sizeof(struct mpool_chunk) * mpf->nChunks);
}

// 0 success, -1 fail
int fixed_mpool_add_chunk(struct fixed_mpool* mpf)
{
	struct mpool_cell* cell;
	
	assert(mpf->pChunks != NULL);

	if (mpf->iNextChunk == mpf->nChunks) {
		size_t old_size = sizeof(struct mpool_chunk) * mpf->nChunks;
		size_t new_size = 2 * old_size;

		// allocate new chunk array
		struct mpool_chunk* c = (struct mpool_chunk*)
			mpf->ca->chunk_realloc(mpf->ca, mpf->pChunks, old_size, new_size);

		if (NULL == c) return -1;
		mpf->pChunks = c;
		mpf->nChunks *= 2;     // chunk array expanded 2
		mpf->chunk_size *= 2;  // chunk_size  expanded 2 also
	}

	// allocate a new cell array
	cell = (struct mpool_cell*)mpf->ca->chunk_alloc(mpf->ca, mpf->chunk_size);
	if (NULL == cell) return -1;
	mpf->pChunks[mpf->iNextChunk].cell = cell;
	mpf->pChunks[mpf->iNextChunk].size = mpf->chunk_size;
	mpf->iNextChunk++;
	mpool_chunk_init(cell, mpf->chunk_size / mpf->cell_size, mpf->cell_size);
	mpf->head = cell;
	
	return 0;
}

void* fixed_mpool_alloc(struct fixed_mpool* mpf)
{
	struct mpool_cell* cell;

	if (NULL == mpf->head) {
		// in most case it will not run this path

		if (fixed_mpool_add_chunk(mpf) != 0)
			return NULL;
	}
	cell = mpf->head;
	mpf->used_cells++;
	mpf->head = mpf->head->next;
	return cell;
}

void fixed_mpool_free(struct fixed_mpool* mpf, void* ptr)
{
	struct mpool_cell* cell = (struct mpool_cell*)ptr;
	cell->next = mpf->head;
	mpf->used_cells--;
	mpf->head = cell;
}

/**********************************************************************************/
/**
 * require initialized fields:
 *   max_cell_size
 *	 chunk_size
 *	 ca [0 OR initialized]
 */
void mpool_init(struct mpool* mp)
{
	int i, nFixed;
	struct chunk_allocator* al;
	
	assert(mp->max_cell_size < mp->chunk_size);

	if (NULL == mp->chunk_alloc)
		al = mp->ca = &fal;
	else {
		al = mp->ca;
		assert(NULL != al->chunk_alloc);
		assert(NULL != al->chunk_free);
		if (NULL == al->chunk_realloc)
			al->chunk_realloc = &default_chunk_realloc;
	}
	mp->chunk_alloc = (chunk_alloc_ft)&mpool_alloc;
	mp->chunk_free = (chunk_free_ft)&mpool_free;
	mp->chunk_realloc = (chunk_realloc_ft)&default_chunk_realloc;

	mp->max_cell_size = (mp->max_cell_size + MPOOL_MIN_CELL - 1) / MPOOL_MIN_CELL * MPOOL_MIN_CELL;
	mp->chunk_size = (mp->chunk_size + MPOOL_MIN_CHUNK - 1) / MPOOL_MIN_CHUNK * MPOOL_MIN_CHUNK;
	nFixed = mp->max_cell_size / MPOOL_MIN_CELL;

	mp->fixed = (struct fixed_mpool*)al->chunk_alloc(al, sizeof(struct fixed_mpool) * nFixed);
	if (NULL == mp->fixed) {
		fprintf(stderr, "fatal: febird.mpool_init[max_cell_size=%zd, chunk_size=%zd]/n"
				, mp->max_cell_size, mp->chunk_size);
		abort();
	}

	for (i = 0; i < nFixed; ++i)
	{
		mp->fixed[i].cell_size = (i + 1) * MPOOL_MIN_CELL;
		mp->fixed[i].chunk_size = mp->chunk_size;
		mp->fixed[i].nChunks = 16;
		mp->fixed[i].ca = mp->ca;
		fixed_mpool_init(&mp->fixed[i]);
	}
#ifdef FEBIRD_MPOOL_ALLOW_BIG_BLOCK
	mp->big_blocks = 0;
	mp->big_list.prev = mp->big_list.next = &mp->big_list;
#endif
}

void mpool_destroy(struct mpool* mp)
{
	size_t i, nFixed;
#ifdef FEBIRD_MPOOL_ALLOW_BIG_BLOCK
	struct big_block_header *p;
	for (i = 0, p = mp->big_list.next; p != &mp->big_list; ++i)
	{
		struct big_block_header *q;
		if (i > mp->big_blocks) {
			fprintf(stderr, "fatal: febird.mpool_destroy/n");
			abort();
		}
		q = p->next;
		free(p);
		p = q;
	}
	assert(i == mp->big_blocks);
#endif
	nFixed = mp->max_cell_size / MPOOL_MIN_CELL;
	for (i = 0; i < nFixed; ++i)
	{
		fixed_mpool_destroy(&mp->fixed[i]);
	}
	mp->ca->chunk_free(mp->ca, mp->fixed, sizeof(struct fixed_mpool) * nFixed);
}

void* mpool_alloc(struct mpool* mp, size_t size)
{
	size_t idx;
	struct fixed_mpool* mpf;
	struct mpool_cell* cell;

#ifdef FEBIRD_MPOOL_ALLOW_BIG_BLOCK
	if (size > mp->max_cell_size) {
		// this is rare case
		struct big_block_header *p, *h;
		p = (struct big_block_header*)malloc(sizeof(struct big_block_header) + size);
		if (p) {
			h = &mp->big_list;
			p->prev = h;
			p->next = h->next;
			h->next->prev = p;
			h->next = p;
			mp->big_blocks++;
			return (p + 1);
		} else
			return NULL;
	}
#else
	assert(size <= mp->max_cell_size);
#endif
	assert(size > 0);

	idx = (size - 1) / MPOOL_MIN_CELL;
	mpf = &mp->fixed[idx];

// same as fixed_mpool_alloc....	
	if (NULL == mpf->head) {
		// in most case it will not run this path
		
		if (fixed_mpool_add_chunk(mpf) != 0)
			return NULL;
	}
	cell = mpf->head;
	mpf->used_cells++;
	mpf->head = mpf->head->next;
	return cell;
}

void mpool_free(struct mpool* mp, void* ptr, size_t size)
{
	size_t idx;
	struct fixed_mpool* mpf;
	struct mpool_cell* cell;

#ifdef FEBIRD_MPOOL_ALLOW_BIG_BLOCK
	if (size > mp->max_cell_size) {
		// this is rare case
		struct big_block_header* bbh = (struct big_block_header*)ptr - 1;
		bbh->prev->next = bbh->next;
		bbh->next->prev = bbh->prev;
		free(bbh);
		mp->big_blocks--;
		return;
	}
#else
	assert(size <= mp->max_cell_size);
#endif
	assert(size > 0);

	idx = (size - 1) / MPOOL_MIN_CELL;
	mpf = &mp->fixed[idx];

// same as fixed_mpool_free...
	cell = (struct mpool_cell*)ptr;
	cell->next = mpf->head;
	mpf->used_cells--;
	mpf->head = cell;
}

size_t mpool_used_cells(const struct mpool* mp)
{
	size_t i, n = mp->max_cell_size / MPOOL_MIN_CELL;
	size_t used = 0;
	for (i = 0; i < n; ++i)
		used += mp->fixed[i].used_cells;
	return used;
}

size_t mpool_used_bytes(const struct mpool* mp)
{
	size_t i, n = mp->max_cell_size / MPOOL_MIN_CELL;
	size_t used = 0;
	for (i = 0; i < n; ++i)
		used += mp->fixed[i].cell_size * mp->fixed[i].used_cells;
	return used;
}

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

194

195

196

197

198

199

200

201

202

203

204

205

206

207

208

209

210

211

212

213

214

215

216

217

218

219

220

221

222

223

224

225

226

227

228

229

230

231

232

233

234

235

236

237

238

239

240

241

242

243

244

245

246

247

248

249

250

251

252

253

254

255

256

257

258

259

260

261

262

263

264

265

266

267

268

269

270

271

272

273

274

275

276

277

278

279

280

281

282

283

284

285

286

287

288

289

290

291

292

293

294

295

296

297

298

299

300

301

302

303

304

305

306

307

308

309

310

311

312

313

314

315

316

317

318

319

320

321

322

323

324

325

326

327

328

329

330

331

332

333

334

335

336

337

338

339

340

341

342

343

344

345

346

347

348

349

350

351

352

353

354

355

356

#include "stdafx.h"

#include <stdio.h>

#include <stdlib.h>

#include <string.h>

#include <assert.h>

#include "mpool.h"

/* must be power of 2 and greater than sizeof(void*) */

#define MPOOL_MIN_CELL 8

#define MPOOL_MIN_CHUNK 256

struct mpool_cell

{

struct mpool_cell* next;

};

struct mpool_chunk

{

struct mpool_cell* cell; // cell array

size_t size; // size in bytes;

};

/**********************************************************************************/

/**

* chunk_allocator use malloc/free

static void*

chunk_malloc_alloc(struct chunk_allocator* , size_t size)

{

return malloc(size);

}

static void

chunk_malloc_free(struct chunk_allocator* , void* block, size_t size)

{

return free(block);

}

static void*

chunk_malloc_realloc(struct chunk_allocator* , void* block, size_t old_size, size_t new_size)

{

return realloc(block, new_size);

}

/**********************************************************************************/

void*

default_chunk_realloc(struct chunk_allocator* ca, void* ptr, size_t old_size, size_t new_size)

{

void* q = ca->chunk_alloc(ca, new_size);

assert(old_size > 0);

assert(new_size > 0);

if (NULL == q) return NULL;

memcpy(q, ptr, old_size < new_size ? old_size : new_size);

ca->chunk_free(ca, ptr, old_size);

return q;

}

static void

mpool_chunk_init(struct mpool_cell* cell, size_t cell_count, size_t cell_size)

{

size_t i;

struct mpool_cell* p = cell;

assert(cell_size % MPOOL_MIN_CELL == 0);

for (i = 0; i < cell_count-1; ++i)

p = p->next = (struct mpool_cell*)((char*)p + cell_size);

p->next = NULL;

}

/**********************************************************************************/

static struct chunk_allocator fal = {

&chunk_malloc_alloc,

&chunk_malloc_realloc,

&chunk_malloc_free

};

/**********************************************************************************/

/**

* require initialized fields:

* cell_size

* chunk_size

* ca [0 OR initialized]

void fixed_mpool_init(struct fixed_mpool* mpf)

{

struct chunk_allocator* al;

if (NULL == mpf->ca)

al = mpf->ca = &fal;

else {

al = mpf->ca;

assert(NULL != al->chunk_alloc);

assert(NULL != al->chunk_free);

if (NULL == al->chunk_realloc)

al->chunk_realloc = &default_chunk_realloc;

}

assert(mpf->chunk_size > 0);

assert(mpf->cell_size > 0);

assert(mpf->cell_size < mpf->chunk_size);

mpf->cell_size = (mpf->cell_size + MPOOL_MIN_CELL - 1) / MPOOL_MIN_CELL * MPOOL_MIN_CELL;

mpf->chunk_size = (mpf->chunk_size + MPOOL_MIN_CHUNK - 1) / MPOOL_MIN_CHUNK * MPOOL_MIN_CHUNK;

mpf->head = NULL;

if (mpf->nChunks < MPOOL_MIN_CHUNK/sizeof(struct mpool_chunk))

mpf->nChunks = MPOOL_MIN_CHUNK/sizeof(struct mpool_chunk);

mpf->iNextChunk = 0;

mpf->pChunks = (struct mpool_chunk*)

al->chunk_alloc(al, sizeof(struct mpool_chunk) * mpf->nChunks);

if (NULL == mpf->pChunks) {

fprintf(stderr

, "fatal: febird.fixed_mpool_init failed, chunk[size=%zd, count=%zd]/n"

, mpf->chunk_size, mpf->nChunks);

abort();

}

mpf->used_cells = 0;

}

void fixed_mpool_destroy(struct fixed_mpool* mpf)

{

struct chunk_allocator* ca = mpf->ca;

long i;

for (i = mpf->iNextChunk - 1; i >= 0; --i)

ca->chunk_free(ca, mpf->pChunks[i].cell, mpf->pChunks[i].size);

ca->chunk_free(ca, mpf->pChunks, sizeof(struct mpool_chunk) * mpf->nChunks);

}

// 0 success, -1 fail

int fixed_mpool_add_chunk(struct fixed_mpool* mpf)

{

struct mpool_cell* cell;

assert(mpf->pChunks != NULL);

if (mpf->iNextChunk == mpf->nChunks) {

size_t old_size = sizeof(struct mpool_chunk) * mpf->nChunks;

size_t new_size = 2 * old_size;

// allocate new chunk array

struct mpool_chunk* c = (struct mpool_chunk*)

mpf->ca->chunk_realloc(mpf->ca, mpf->pChunks, old_size, new_size);

if (NULL == c) return -1;

mpf->pChunks = c;

mpf->nChunks *= 2; // chunk array expanded 2

mpf->chunk_size *= 2; // chunk_size expanded 2 also

}

// allocate a new cell array

cell = (struct mpool_cell*)mpf->ca->chunk_alloc(mpf->ca, mpf->chunk_size);

if (NULL == cell) return -1;

mpf->pChunks[mpf->iNextChunk].cell = cell;

mpf->pChunks[mpf->iNextChunk].size = mpf->chunk_size;

mpf->iNextChunk++;

mpool_chunk_init(cell, mpf->chunk_size / mpf->cell_size, mpf->cell_size);

mpf->head = cell;

return 0;

}

void* fixed_mpool_alloc(struct fixed_mpool* mpf)

{

struct mpool_cell* cell;

if (NULL == mpf->head) {

// in most case it will not run this path

if (fixed_mpool_add_chunk(mpf) != 0)

return NULL;

}

cell = mpf->head;

mpf->used_cells++;

mpf->head = mpf->head->next;

return cell;

}

void fixed_mpool_free(struct fixed_mpool* mpf, void* ptr)

{

struct mpool_cell* cell = (struct mpool_cell*)ptr;

cell->next = mpf->head;

mpf->used_cells--;

mpf->head = cell;

}

/**********************************************************************************/

/**

* require initialized fields:

* max_cell_size

* chunk_size

* ca [0 OR initialized]

void mpool_init(struct mpool* mp)

{

int i, nFixed;

struct chunk_allocator* al;

assert(mp->max_cell_size < mp->chunk_size);

if (NULL == mp->chunk_alloc)

al = mp->ca = &fal;

else {

al = mp->ca;

assert(NULL != al->chunk_alloc);

assert(NULL != al->chunk_free);

if (NULL == al->chunk_realloc)

al->chunk_realloc = &default_chunk_realloc;

}

mp->chunk_alloc = (chunk_alloc_ft)&mpool_alloc;

mp->chunk_free = (chunk_free_ft)&mpool_free;

mp->chunk_realloc = (chunk_realloc_ft)&default_chunk_realloc;

mp->max_cell_size = (mp->max_cell_size + MPOOL_MIN_CELL - 1) / MPOOL_MIN_CELL * MPOOL_MIN_CELL;

mp->chunk_size = (mp->chunk_size + MPOOL_MIN_CHUNK - 1) / MPOOL_MIN_CHUNK * MPOOL_MIN_CHUNK;

nFixed = mp->max_cell_size / MPOOL_MIN_CELL;

mp->fixed = (struct fixed_mpool*)al->chunk_alloc(al, sizeof(struct fixed_mpool) * nFixed);

if (NULL == mp->fixed) {

fprintf(stderr, "fatal: febird.mpool_init[max_cell_size=%zd, chunk_size=%zd]/n"

, mp->max_cell_size, mp->chunk_size);

abort();

}

for (i = 0; i < nFixed; ++i)

{

mp->fixed[i].cell_size = (i + 1) * MPOOL_MIN_CELL;

mp->fixed[i].chunk_size = mp->chunk_size;

mp->fixed[i].nChunks = 16;

mp->fixed[i].ca = mp->ca;

fixed_mpool_init(&mp->fixed[i]);

}

#ifdef FEBIRD_MPOOL_ALLOW_BIG_BLOCK

mp->big_blocks = 0;

mp->big_list.prev = mp->big_list.next = &mp->big_list;

#endif

}

void mpool_destroy(struct mpool* mp)

{

size_t i, nFixed;

#ifdef FEBIRD_MPOOL_ALLOW_BIG_BLOCK

struct big_block_header *p;

for (i = 0, p = mp->big_list.next; p != &mp->big_list; ++i)

{

struct big_block_header *q;

if (i > mp->big_blocks) {

fprintf(stderr, "fatal: febird.mpool_destroy/n");

abort();

}

q = p->next;

free(p);

p = q;

}

assert(i == mp->big_blocks);

#endif

nFixed = mp->max_cell_size / MPOOL_MIN_CELL;

for (i = 0; i < nFixed; ++i)

{

fixed_mpool_destroy(&mp->fixed[i]);

}

mp->ca->chunk_free(mp->ca, mp->fixed, sizeof(struct fixed_mpool) * nFixed);

}

void* mpool_alloc(struct mpool* mp, size_t size)

{

size_t idx;

struct fixed_mpool* mpf;

struct mpool_cell* cell;

#ifdef FEBIRD_MPOOL_ALLOW_BIG_BLOCK

if (size > mp->max_cell_size) {

// this is rare case

struct big_block_header *p, *h;

p = (struct big_block_header*)malloc(sizeof(struct big_block_header) + size);

if (p) {

h = &mp->big_list;

p->prev = h;

p->next = h->next;

h->next->prev = p;

h->next = p;

mp->big_blocks++;

return (p + 1);

} else

return NULL;

}

#else

assert(size <= mp->max_cell_size);

#endif

assert(size > 0);

idx = (size - 1) / MPOOL_MIN_CELL;

mpf = &mp->fixed[idx];

// same as fixed_mpool_alloc....

if (NULL == mpf->head) {

// in most case it will not run this path

if (fixed_mpool_add_chunk(mpf) != 0)

return NULL;

}

cell = mpf->head;

mpf->used_cells++;

mpf->head = mpf->head->next;

return cell;

}

void mpool_free(struct mpool* mp, void* ptr, size_t size)

{

size_t idx;

struct fixed_mpool* mpf;

struct mpool_cell* cell;

#ifdef FEBIRD_MPOOL_ALLOW_BIG_BLOCK

if (size > mp->max_cell_size) {

// this is rare case

struct big_block_header* bbh = (struct big_block_header*)ptr - 1;

bbh->prev->next = bbh->next;

bbh->next->prev = bbh->prev;

free(bbh);

mp->big_blocks--;

return;

}

#else

assert(size <= mp->max_cell_size);

#endif

assert(size > 0);

idx = (size - 1) / MPOOL_MIN_CELL;

mpf = &mp->fixed[idx];

// same as fixed_mpool_free...

cell = (struct mpool_cell*)ptr;

cell->next = mpf->head;

mpf->used_cells--;

mpf->head = cell;

}

size_t mpool_used_cells(const struct mpool* mp)

{

size_t i, n = mp->max_cell_size / MPOOL_MIN_CELL;

size_t used = 0;

for (i = 0; i < n; ++i)

used += mp->fixed[i].used_cells;

return used;

}

size_t mpool_used_bytes(const struct mpool* mp)

{

size_t i, n = mp->max_cell_size / MPOOL_MIN_CELL;

size_t used = 0;

for (i = 0; i < n; ++i)

used += mp->fixed[i].cell_size * mp->fixed[i].used_cells;

return used;

}

memory pool 的高效实现

作者: rockeet 发表日期: 2009年10月23日分类: C++ 评论: 0 条阅读次数: 2,588 次

memory pool

malloc可以分配任意大小的内存，因此，在malloc内部，保存了一些簿记信息（至少有一个包含内存块尺寸的信息）。调用free时，可以正确释放。

为了减少这些簿记开销，可以使用memory pool。

根据使用情境，可以分为两种：

1. 只分配固定大小的内存块，速度最快（normal path约10条机器指令）。

2. 可分配不同大小的内存块，速度稍慢，但比malloc快得多，也无簿记开销。

以下将分别说明

mpool

mpool可分配不同尺寸的内存。大多数时刻，都在内部分配。

销毁mpool时，会自动释放在mpool中未释放的内存。

mpool内部有一个包含多个不同尺寸fixed_mpool的array，根据请求分配的内存大小，直接索引到相应的fixed_mpool来分配一个单元（cell）。

通过定义宏FEBIRD_MPOOL_ALLOW_BIG_BLOCK，就允许大于max_cell_size的内存分配。在这种情况下，使用标准的malloc分配内存，分配出去的内存有额外簿记（用双向链表串起来），以便在销毁mpool时自动释放。

mpool graph

fixed_mpool

尺寸固定的内存池，一旦创建，该内存池只能分配固定尺寸的内存单元（cell）。这在很多情况下都适用，例如链表结点、树节点、图结点、自定义的结构、等等。

用于stl的map/set/list再适合不过了——但是不能用于vector/deque等需要分配可变尺寸的容器。

fixed_mpool内部的多个chunk使用数组，类似std::vector，iNextChunk相当于vector.size，nChunks相当于vector.capacity，每次空间不够时扩张一倍。使用数组，而不是链表，有以下好处：

1. 有助于对齐——如果chunk_allocator是对齐（对齐>=32时）分配的，而chunk使用链表组织，就会在chunk开始处预留一个chunk头部，这会导致不对齐。

2. 如果cell_size也刚好较大且整除chunk_size，使用链表就会浪费将近一个cell（cell_size – chunk_header_size）。

3. 如果不把连接信息保存在chunk header，就需要另外分配chunk结点，而分配chunk结点又需要其它内存分配函数。

空闲表使用单链表，因此，理论上每个cell最小必须能容纳一个指针，32位系统式4字节，64位系统式8字节，实现中使用8字节作为最小值。

google.tcmalloc 要点

线程缓存(thread cache)

加锁(locking)

google.sparsetable 实现细节

gcc 4.5

my PipelineProcessor

简单的代码生成器创建领域语言

字符串基数排序

rdtsc 备忘

内嵌变长数据结构范例——trbstrmap

使用

大多数情况下

数据结构

复杂情形：任意字符串，字符串中可以包含/0

memory pool 的高效实现（代码）

memory pool 的高效实现

memory pool

mpool

近期文章

近期评论

文章归档

分类目录

功能