<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>
<channel>
	<title>超群.com's Blog &#187; Real-Time Index</title>
	<atom:link href="http://www.fuchaoqun.com/tag/%e5%ae%9e%e6%97%b6%e7%b4%a2%e5%bc%95/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.fuchaoqun.com</link>
	<description></description>
	<lastBuildDate>Thu, 08 Sep 2011 15:08:19 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Real-Time Full-Text Search with Sphinx</title>
		<link>http://www.fuchaoqun.com/2010/06/sphinx-real-time-index/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=sphinx-real-time-index</link>
		<comments>http://www.fuchaoqun.com/2010/06/sphinx-real-time-index/#comments</comments>
		<pubDate>Tue, 22 Jun 2010 12:13:53 +0000</pubDate>
		<dc:creator>超群.com</dc:creator>
				<category><![CDATA[Full Text Search]]></category>
		<category><![CDATA[on demand index]]></category>
		<category><![CDATA[real-time index]]></category>
		<category><![CDATA[sphinx]]></category>
		<category><![CDATA[实时索引]]></category>
		<guid isPermaLink="false">http://www.fuchaoqun.com/?p=357</guid>
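		<description><![CDATA[Sphinx 0.9.9 and earlier have no native support for real-time indexing; the usual workaround is a main index plus a delta index, which gives "near real-time" indexing. The latest 1.10.1 (in trunk, not yet released) finally supports real-time indexes, and going by the documentation in SVN it is easy to build an on-demand full-text search system with Sphinx. Reference: http://filiptepper.com/2010/05/27/real-time-indexing-and-searching-with-sphinx-1-10-1-dev.html First, check out the latest code from the sphinxsearch SVN repository, then build and install it: svn checkout http://sphinxsearch.googlecode.com/svn/trunk sphinx cd sphinx/ ./configure --prefix=/path/to/sphinx make make install If the build succeeds, create a sphinx.conf configuration file in the etc directory under the Sphinx install prefix. Be sure to include the charset and n-gram settings for Chinese, otherwise Chinese text will not be handled correctly: index rt { # Set the index type to real-time index type = rt # Use UTF-8 encoding charset_type = utf-8 # Character table for UTF-8 charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F # Unigram (1-gram) tokenization ngram_len = 1 # Characters to be split into n-grams ngram_chars = U+3000..U+2FA1F # Where the index files are stored [...]]]></description>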
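			<content:encoded><![CDATA[<p>Sphinx 0.9.9 and earlier have no native support for real-time indexing; the usual workaround is a main index plus a delta index, which gives "near real-time" indexing. The latest 1.10.1 (in trunk, not yet released) finally supports real-time indexes. Going by the documentation in SVN, it is easy to build an on-demand full-text search system with Sphinx.</p>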
<p>Reference: <a href="http://filiptepper.com/2010/05/27/real-time-indexing-and-searching-with-sphinx-1-10-1-dev.html">http://filiptepper.com/2010/05/27/real-time-indexing-and-searching-with-sphinx-1-10-1-dev.html</a></p>
<p>First, check out the latest code from the sphinxsearch SVN repository, then build and install it:</p>
<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">svn</span> checkout http:<span style="color: #000000; font-weight: bold;">//</span>sphinxsearch.googlecode.com<span style="color: #000000; font-weight: bold;">/</span>svn<span style="color: #000000; font-weight: bold;">/</span>trunk sphinx
<span style="color: #7a0874; font-weight: bold;">cd</span> sphinx<span style="color: #000000; font-weight: bold;">/</span>
.<span style="color: #000000; font-weight: bold;">/</span>configure <span style="color: #660033;">--prefix</span>=<span style="color: #000000; font-weight: bold;">/</span>path<span style="color: #000000; font-weight: bold;">/</span>to<span style="color: #000000; font-weight: bold;">/</span>sphinx
<span style="color: #c20cb9; font-weight: bold;">make</span>
<span style="color: #c20cb9; font-weight: bold;">make</span> <span style="color: #c20cb9; font-weight: bold;">install</span></pre></div></div>
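<p>If the build succeeds, create a sphinx.conf configuration file in the etc directory under the Sphinx install prefix. Be sure to include the charset and n-gram settings for Chinese, otherwise Chinese text will not be handled correctly:</p>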
<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;">index rt {
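    # Set the index type to real-time index
    type = rt
    # Use UTF-8 encoding
    charset_type  = utf-8
    # Character table for UTF-8
    charset_table  = 0..9, A..Z-&gt;a..z, _, a..z, U+410..U+42F-&gt;U+430..U+44F, U+430..U+44F
    # Unigram (1-gram) tokenization
    ngram_len = 1
    # Characters to be split into n-grams
    ngram_chars   = U+3000..U+2FA1F
    # Where the index files are stored
    path = /path/to/sphinx/data/rt
    # Full-text field to index
    rt_field = message
    # Attribute stored with each document
    rt_attr_uint = message_id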
}
&nbsp;
searchd {
    log = /path/to/sphinx/log/searchd.log
    query_log = /path/to/sphinx/log/query.log
    pid_file = /path/to/sphinx/log/searchd.pid
    workers = threads
    # Sphinx emulates the MySQL wire protocol, so no real MySQL server is needed; mysql41 selects the protocol used by MySQL 4.1 through 5.1
    listen = 127.0.0.1:9527:mysql41
}</pre></div></div>
<p>Start the Sphinx service:</p>
<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">/</span>path<span style="color: #000000; font-weight: bold;">/</span>to<span style="color: #000000; font-weight: bold;">/</span>sphinx<span style="color: #000000; font-weight: bold;">/</span>bin<span style="color: #000000; font-weight: bold;">/</span>searchd <span style="color: #660033;">--config</span> <span style="color: #000000; font-weight: bold;">/</span>path<span style="color: #000000; font-weight: bold;">/</span>to<span style="color: #000000; font-weight: bold;">/</span>sphinx<span style="color: #000000; font-weight: bold;">/</span>etc<span style="color: #000000; font-weight: bold;">/</span>sphinx.conf</pre></div></div>
<p>Insert a few rows to try it out:</p>
<div class="wp_syntax"><div class="code"><pre class="mysql" style="font-family:monospace;">ubuntu:chaoqun <span style="color: #CC0099;">~</span>:mysql <span style="color: #CC0099;">-</span>h127.0.0.1 <span style="color: #CC0099;">-</span>P9527
Welcome <span style="color: #990099; font-weight: bold;">to</span> the MySQL monitor.  Commands <span style="color: #009900;">end</span> <span style="color: #990099; font-weight: bold;">with</span> <span style="color: #000033;">;</span> <span style="color: #CC0099; font-weight: bold;">or</span> \g.
Your MySQL <span style="color: #FF9900; font-weight: bold;">connection</span> id <span style="color: #CC0099; font-weight: bold;">is</span> <span style="color: #008080;">1</span>
Server <span style="color: #000099;">version</span>: 1.10.1<span style="color: #CC0099;">-</span>dev <span style="color: #FF00FF;">&#40;</span>r2351<span style="color: #FF00FF;">&#41;</span>
&nbsp;
<span style="color: #990099; font-weight: bold;">Type</span> <span style="color: #008000;">'help;'</span> <span style="color: #CC0099; font-weight: bold;">or</span> <span style="color: #008000;">'<span style="color: #004000; font-weight: bold;">\h</span>'</span> for <span style="color: #990099; font-weight: bold;">help</span>. <span style="color: #990099; font-weight: bold;">Type</span> <span style="color: #008000;">'<span style="color: #004000; font-weight: bold;">\c</span>'</span> <span style="color: #990099; font-weight: bold;">to</span> clear the current input statement.
&nbsp;
mysql<span style="color: #CC0099;">&gt;</span> <span style="color: #990099; font-weight: bold;">INSERT</span> <span style="color: #990099; font-weight: bold;">INTO</span> rt <span style="color: #990099; font-weight: bold;">VALUES</span> <span style="color: #FF00FF;">&#40;</span><span style="color: #008080;">1</span><span style="color: #000033;">,</span> <span style="color: #008000;">'this message has a body'</span><span style="color: #000033;">,</span> <span style="color: #008080;">1</span><span style="color: #FF00FF;">&#41;</span><span style="color: #000033;">;</span>
Query OK<span style="color: #000033;">,</span> <span style="color: #008080;">1</span> row affected <span style="color: #FF00FF;">&#40;</span><span style="color: #008080;">0.01</span> sec<span style="color: #FF00FF;">&#41;</span>
&nbsp;
mysql<span style="color: #CC0099;">&gt;</span> <span style="color: #990099; font-weight: bold;">INSERT</span> <span style="color: #990099; font-weight: bold;">INTO</span> rt <span style="color: #990099; font-weight: bold;">VALUES</span> <span style="color: #FF00FF;">&#40;</span><span style="color: #008080;">2</span><span style="color: #000033;">,</span> <span style="color: #008000;">'测试中文OK'</span><span style="color: #000033;">,</span> <span style="color: #008080;">2</span><span style="color: #FF00FF;">&#41;</span><span style="color: #000033;">;</span>
Query OK<span style="color: #000033;">,</span> <span style="color: #008080;">1</span> row affected <span style="color: #FF00FF;">&#40;</span><span style="color: #008080;">0.00</span> sec<span style="color: #FF00FF;">&#41;</span>
&nbsp;
mysql<span style="color: #CC0099;">&gt;</span></pre></div></div>
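<p>The INSERT statements above depend on positional column order: the document id, then the message field, then the message_id attribute. As a minimal sketch, assuming the same rt index, a row can also be written with an explicit column list, and SphinxQL's REPLACE and DELETE statements can overwrite or remove a document by id. The id 3 and its text are made up for illustration and are not part of the original session; the row is deleted at the end, so the index is left as it was. The comment lines are annotations for the reader and can be dropped when typing the statements.</p>
<div class="wp_syntax"><div class="code"><pre class="mysql" style="font-family:monospace;">-- Illustrative only: id 3 and its text are assumptions, not from the original session.
-- Insert with an explicit column list: id, the full-text field, then the attribute.
INSERT INTO rt (id, message, message_id) VALUES (3, 'another test row', 3);
-- REPLACE overwrites the document that has the same id.
REPLACE INTO rt (id, message, message_id) VALUES (3, 'updated test row', 3);
-- DELETE removes the document by id.
DELETE FROM rt WHERE id = 3;</pre></div></div>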
<p>Test full-text search:</p>
<div class="wp_syntax"><div class="code"><pre class="mysql" style="font-family:monospace;">mysql<span style="color: #CC0099;">&gt;</span> <span style="color: #990099; font-weight: bold;">SELECT</span> <span style="color: #CC0099;">*</span> <span style="color: #990099; font-weight: bold;">FROM</span> rt <span style="color: #990099; font-weight: bold;">WHERE</span> <span style="color: #990099; font-weight: bold;">MATCH</span><span style="color: #FF00FF;">&#40;</span><span style="color: #008000;">'message'</span><span style="color: #FF00FF;">&#41;</span><span style="color: #000033;">;</span>
<span style="color: #CC0099;">+------+--------+------------+</span>
<span style="color: #CC0099;">|</span> id   <span style="color: #CC0099;">|</span> weight <span style="color: #CC0099;">|</span> message_id <span style="color: #CC0099;">|</span>
<span style="color: #CC0099;">+------+--------+------------+</span>
<span style="color: #CC0099;">|</span>    <span style="color: #008080;">1</span> <span style="color: #CC0099;">|</span>   <span style="color: #008080;">1643</span> <span style="color: #CC0099;">|</span>          <span style="color: #008080;">1</span> <span style="color: #CC0099;">|</span>
<span style="color: #CC0099;">+------+--------+------------+</span>
<span style="color: #008080;">1</span> row <span style="color: #990099; font-weight: bold;">in</span> <span style="color: #990099; font-weight: bold;">set</span> <span style="color: #FF00FF;">&#40;</span><span style="color: #008080;">0.00</span> sec<span style="color: #FF00FF;">&#41;</span>
&nbsp;
mysql<span style="color: #CC0099;">&gt;</span> <span style="color: #990099; font-weight: bold;">SELECT</span> <span style="color: #CC0099;">*</span> <span style="color: #990099; font-weight: bold;">FROM</span> rt <span style="color: #990099; font-weight: bold;">WHERE</span> <span style="color: #990099; font-weight: bold;">MATCH</span><span style="color: #FF00FF;">&#40;</span><span style="color: #008000;">'OK'</span><span style="color: #FF00FF;">&#41;</span><span style="color: #000033;">;</span>
<span style="color: #CC0099;">+------+--------+------------+</span>
<span style="color: #CC0099;">|</span> id   <span style="color: #CC0099;">|</span> weight <span style="color: #CC0099;">|</span> message_id <span style="color: #CC0099;">|</span>
<span style="color: #CC0099;">+------+--------+------------+</span>
<span style="color: #CC0099;">|</span>    <span style="color: #008080;">2</span> <span style="color: #CC0099;">|</span>   <span style="color: #008080;">1643</span> <span style="color: #CC0099;">|</span>          <span style="color: #008080;">2</span> <span style="color: #CC0099;">|</span>
<span style="color: #CC0099;">+------+--------+------------+</span>
<span style="color: #008080;">1</span> row <span style="color: #990099; font-weight: bold;">in</span> <span style="color: #990099; font-weight: bold;">set</span> <span style="color: #FF00FF;">&#40;</span><span style="color: #008080;">0.01</span> sec<span style="color: #FF00FF;">&#41;</span>
&nbsp;
mysql<span style="color: #CC0099;">&gt;</span> <span style="color: #990099; font-weight: bold;">SELECT</span> <span style="color: #CC0099;">*</span> <span style="color: #990099; font-weight: bold;">FROM</span> rt <span style="color: #990099; font-weight: bold;">WHERE</span> <span style="color: #990099; font-weight: bold;">MATCH</span><span style="color: #FF00FF;">&#40;</span><span style="color: #008000;">'中'</span><span style="color: #FF00FF;">&#41;</span><span style="color: #000033;">;</span>
<span style="color: #CC0099;">+------+--------+------------+</span>
<span style="color: #CC0099;">|</span> id   <span style="color: #CC0099;">|</span> weight <span style="color: #CC0099;">|</span> message_id <span style="color: #CC0099;">|</span>
<span style="color: #CC0099;">+------+--------+------------+</span>
<span style="color: #CC0099;">|</span>    <span style="color: #008080;">2</span> <span style="color: #CC0099;">|</span>   <span style="color: #008080;">1643</span> <span style="color: #CC0099;">|</span>          <span style="color: #008080;">2</span> <span style="color: #CC0099;">|</span>
<span style="color: #CC0099;">+------+--------+------------+</span>
<span style="color: #008080;">1</span> row <span style="color: #990099; font-weight: bold;">in</span> <span style="color: #990099; font-weight: bold;">set</span> <span style="color: #FF00FF;">&#40;</span><span style="color: #008080;">0.00</span> sec<span style="color: #FF00FF;">&#41;</span>
&nbsp;
mysql<span style="color: #CC0099;">&gt;</span> <span style="color: #990099; font-weight: bold;">SELECT</span> <span style="color: #CC0099;">*</span> <span style="color: #990099; font-weight: bold;">FROM</span> rt <span style="color: #990099; font-weight: bold;">WHERE</span> <span style="color: #990099; font-weight: bold;">MATCH</span><span style="color: #FF00FF;">&#40;</span><span style="color: #008000;">'我'</span><span style="color: #FF00FF;">&#41;</span><span style="color: #000033;">;</span>
Empty <span style="color: #990099; font-weight: bold;">set</span> <span style="color: #FF00FF;">&#40;</span><span style="color: #008080;">0.00</span> sec<span style="color: #FF00FF;">&#41;</span>
&nbsp;
mysql<span style="color: #CC0099;">&gt;</span></pre></div></div>
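<p>As the result sets above show, SELECT * returns only the document id, the weight and the message_id attribute; the text of the message field is not returned. The single character '中' matches because, with ngram_len = 1, each character listed in ngram_chars is indexed as its own token, while '我' does not occur in any inserted row. To search for a multi-character Chinese word, the Sphinx documentation for ngram_len suggests wrapping it in double quotes so that the resulting 1-grams are matched as a phrase. The queries below are a hedged sketch of that idea against the same rt index; they are not part of the original session, so no output is shown.</p>
<div class="wp_syntax"><div class="code"><pre class="mysql" style="font-family:monospace;">-- Illustrative queries only; these were not run in the original session.
-- Phrase query: the quoted CJK text is split into 1-grams and matched as adjacent tokens.
SELECT * FROM rt WHERE MATCH('"中文"');
-- Show statistics for the previous query, such as per-keyword document and hit counts.
SHOW META;</pre></div></div>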
<p>Simple and convenient. That's all it takes.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.fuchaoqun.com/2010/06/sphinx-real-time-index/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
