phinde.git
Christian Weiske [Thu, 1 Sep 2016 05:47:49 +0000 (07:47 +0200)]
Always show text, make text extract size configurable.

Resolves: #8

Christian Weiske [Thu, 1 Sep 2016 05:38:08 +0000 (07:38 +0200)]
remove anchor from source URLs

Christian Weiske [Tue, 30 Aug 2016 19:37:50 +0000 (21:37 +0200)]
tell why crawler stops

Christian Weiske [Tue, 30 Aug 2016 11:35:05 +0000 (13:35 +0200)]
Add crawlBlacklist configuration option

Resolves: #7

Christian Weiske [Tue, 30 Aug 2016 11:10:03 +0000 (13:10 +0200)]
Allow worker instances of multiple projects in parallel

Change "queuePrefix" configuration in each project

Resolves: #5

Christian Weiske [Tue, 30 Aug 2016 11:05:14 +0000 (13:05 +0200)]
Fix notice

Christian Weiske [Tue, 30 Aug 2016 11:03:26 +0000 (13:03 +0200)]
Make phinde-worker configurable; allow queue selection

Resolves: #6

Christian Weiske [Tue, 30 Aug 2016 06:13:33 +0000 (08:13 +0200)]
Option to disable linked URL indexing

Resolves: #2

Christian Weiske [Tue, 30 Aug 2016 06:05:00 +0000 (08:05 +0200)]
Add support for modification date queries: "before:", "after:" and "date:"

Resolves: #4

Christian Weiske [Tue, 30 Aug 2016 05:36:34 +0000 (07:36 +0200)]
Support "nick:cweiske" search syntax as alias for "author.name"

Resolves: #3

Christian Weiske [Mon, 29 Aug 2016 20:59:16 +0000 (22:59 +0200)]
Respect <meta name="robots" content="noindex"/>

Fixes: #1

Christian Weiske [Mon, 29 Aug 2016 18:30:45 +0000 (20:30 +0200)]
Send If-Modified-Since header on crawling and indexing

Christian Weiske [Thu, 26 May 2016 13:20:23 +0000 (15:20 +0200)]
add LICENSE file

Christian Weiske [Thu, 31 Mar 2016 18:46:01 +0000 (20:46 +0200)]
wip pubsubhubbub

Christian Weiske [Fri, 12 Feb 2016 16:04:42 +0000 (17:04 +0100)]
opensearch paging

Christian Weiske [Fri, 12 Feb 2016 06:43:25 +0000 (07:43 +0100)]
trim query string

Christian Weiske [Thu, 11 Feb 2016 21:43:34 +0000 (22:43 +0100)]
opensearch support (tag: v0.1.0)

Christian Weiske [Thu, 11 Feb 2016 19:02:30 +0000 (20:02 +0100)]
support base href

Christian Weiske [Thu, 11 Feb 2016 16:37:12 +0000 (17:37 +0100)]
sanitize title better

Christian Weiske [Thu, 11 Feb 2016 16:00:58 +0000 (17:00 +0100)]
use correct meta robots attribute

Christian Weiske [Thu, 11 Feb 2016 07:43:01 +0000 (08:43 +0100)]
debug option for crawler

Christian Weiske [Wed, 10 Feb 2016 21:02:11 +0000 (22:02 +0100)]
add date sorting

Christian Weiske [Wed, 10 Feb 2016 20:15:35 +0000 (21:15 +0100)]
remove debug statement

Christian Weiske [Wed, 10 Feb 2016 16:26:15 +0000 (17:26 +0100)]
crawler supports "nofollow" now

Christian Weiske [Wed, 10 Feb 2016 16:09:56 +0000 (17:09 +0100)]
send accept header during crawl

Christian Weiske [Wed, 10 Feb 2016 14:14:34 +0000 (15:14 +0100)]
some styling, noindex for search result pages

Christian Weiske [Wed, 10 Feb 2016 13:56:20 +0000 (14:56 +0100)]
rework crawler; add atom link extraction

Christian Weiske [Sat, 6 Feb 2016 19:27:58 +0000 (20:27 +0100)]
about section readme

Christian Weiske [Fri, 5 Feb 2016 05:48:45 +0000 (06:48 +0100)]
add site GET parameter

Christian Weiske [Thu, 4 Feb 2016 22:59:52 +0000 (23:59 +0100)]
default config

Christian Weiske [Thu, 4 Feb 2016 22:58:00 +0000 (23:58 +0100)]
do not exit on null query

Christian Weiske [Thu, 4 Feb 2016 22:55:41 +0000 (23:55 +0100)]
check for content attributes

Christian Weiske [Thu, 4 Feb 2016 22:46:45 +0000 (23:46 +0100)]
remove multiple tags

Christian Weiske [Thu, 4 Feb 2016 16:23:14 +0000 (17:23 +0100)]
do not show filter headline if there are none

Christian Weiske [Thu, 4 Feb 2016 16:20:23 +0000 (17:20 +0100)]
show query time

Christian Weiske [Thu, 4 Feb 2016 16:12:14 +0000 (17:12 +0100)]
change default query operator to AND

Christian Weiske [Thu, 4 Feb 2016 16:10:49 +0000 (17:10 +0100)]
Show site search reset link

Christian Weiske [Thu, 4 Feb 2016 15:58:33 +0000 (16:58 +0100)]
escape html in search results

Christian Weiske [Wed, 3 Feb 2016 21:37:15 +0000 (22:37 +0100)]
fix indexing, boost config

Christian Weiske [Wed, 3 Feb 2016 21:18:52 +0000 (22:18 +0100)]
no simplexml anymore, content extraction improvements

Christian Weiske [Wed, 3 Feb 2016 20:25:34 +0000 (21:25 +0100)]
follow redirect, do not verify ssl certificates, use final after-redirect url

Christian Weiske [Wed, 3 Feb 2016 20:12:17 +0000 (21:12 +0100)]
add site search, highlighting

Christian Weiske [Wed, 3 Feb 2016 19:03:35 +0000 (20:03 +0100)]
show elasticsearch query time

Christian Weiske [Wed, 3 Feb 2016 16:23:06 +0000 (17:23 +0100)]
filtering works

Christian Weiske [Wed, 3 Feb 2016 05:21:30 +0000 (06:21 +0100)]
first frontend

Christian Weiske [Mon, 1 Feb 2016 19:18:59 +0000 (20:18 +0100)]
first kinda working version