Sphider
a PHP spider and search engine
Sphider News
Sphider 5.5.1, SphiderLite 2.6.1 have been released
6 October 2024 - The search now uses an updated jQuery. A change in buffering of the output to stdout during an indexing run reduces the lag time (and just MAY reduce the occurrence of the dreaded 500 error). A fix was developed for the rare case of a "pdf" file being misidentified in the headers as "text".
While an upgrade can be done by simply replacing the files noted in the changelog, then running /admin/version_update.php, a clean install is recommended. If you choose just to replace files, it is CRUCIAL that you run version_update.php!
Sphider 5.5.0, SphiderLite 2.6.0 have been released
20 December 2023 - An error in which checking "Index decimals" in settings caused pages to not index was corrected. Along with this fix comes the option of choosing which decimal separator is to be used, decimal period or decimal comma. Prior to this, Sphider just assumed the separator was a decimal period, which excluded the half of the world that uses a decimal comma. Indexing of numbers in general was improved by stripping the thousands separator from large numbers. In addition, the ability to actually search for indexed decimals has been added.
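As a rough illustration of the number handling described above, here is a minimal sketch in Python (not Sphider's PHP code). The function name normalize_number and the decimal_sep parameter are hypothetical, and normalizing to a period internally is an assumption of this sketch, not a statement of how Sphider stores numbers.

```python
def normalize_number(token: str, decimal_sep: str = ".") -> str:
    """Return token with thousands separators removed and '.' as decimal mark.

    decimal_sep is the configured decimal separator ('.' or ','). The other
    character is assumed to be the thousands separator (an assumption).
    """
    if decimal_sep not in (".", ","):
        raise ValueError("decimal_sep must be '.' or ','")
    thousands_sep = "," if decimal_sep == "." else "."
    # Strip the thousands separator first, then normalize the decimal mark.
    cleaned = token.replace(thousands_sep, "")
    return cleaned.replace(decimal_sep, ".")
```

With a decimal comma configured, "1.234,56" and "1234,56" would then index as the same value, which is what makes a later search for the decimal possible.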
Also new is the ability to read compressed sitemaps (*.xml.gz). The limitation is that the uncompressed file size must not exceed 100,000 bytes. This should still be enough to link 500 URLs per xml.gz file.
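One way such a size limit could be enforced is sketched below in Python (not Sphider's PHP code). The function name read_gz_sitemap is hypothetical, and how Sphider itself checks the limit is not stated here. Only the 100,000-byte figure comes from the release note.

```python
import gzip
import io

MAX_UNCOMPRESSED_BYTES = 100_000  # limit stated in the release note


def read_gz_sitemap(data: bytes, limit: int = MAX_UNCOMPRESSED_BYTES) -> bytes:
    """Decompress gzip data, refusing anything larger than `limit` bytes."""
    with gzip.GzipFile(fileobj=io.BytesIO(data)) as gz:
        # Read one byte past the limit so an oversized file is detected
        # without decompressing all of it.
        xml = gz.read(limit + 1)
    if len(xml) > limit:
        raise ValueError(f"uncompressed sitemap exceeds {limit} bytes")
    return xml
```

Reading a bounded amount rather than the whole stream also keeps a maliciously large archive from exhausting memory.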
One column name in the settings table was changed to avoid a conflict with a MySQL reserved word.
There were also several code deprecation fixes applied. Testing is being done using PHP 8.3.
The Sphiderlite version got a fix for "Update settings" actually corrupting the settings file.
While an upgrade can be done by simply replacing the files noted in the changelog, then running /admin/version_update.php, a clean install is recommended. If you choose just to replace files, it is CRUCIAL that you run version_update.php!
Sphider 5.4.1 has been released
9 November 2023 - An error in the number of arguments, which led to a fatal error when attempting to index a new site from the Index tab, has been corrected. Sites already in the database were unaffected. Also, one minor code deprecation was corrected. SphiderLite is unaffected by these fixes.
Sphider 5.4.0, SphiderLite 2.5.0 have been released
15 October 2023 - Processing of robots.txt files has been improved. Robots.txt is now case sensitive and consideration is given to "allow" directives. All common text files have been integrated into Sphider. The user may assign a default language to a web site, but Sphider will also try to detect the language used on each page and use the appropriate common text set. A new feature is the option of setting built-in pauses during indexing. For indexing from a command prompt, the user help has been updated with better instructions for the "must-include" and "must-not-include" directives. The possibility of the "index to" level being blank has been fixed.
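To illustrate what case-sensitive matching with "allow" directives can look like, here is a minimal sketch in Python (not Sphider's PHP code). The function name is_allowed is hypothetical, and the precedence rule used (longest matching prefix wins, "allow" wins a tie) is an assumption borrowed from common robots.txt practice. Sphider's own rule is not stated here.

```python
def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Decide whether `path` may be crawled.

    rules is a list of (directive, path_prefix) pairs, for example
    ("disallow", "/private/"). Prefixes are compared case-sensitively.
    The longest matching prefix wins and "allow" wins a tie (assumption).
    With no matching rule the path is allowed.
    """
    best_len = -1
    allowed = True
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # empty or non-matching rule
        is_allow = directive.lower() == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed
```

Under these assumptions, a "disallow" rule for /private/ does not block /Private/, because the path comparison respects case.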
In the full version, Sphider not obeying the "must-not-include" directives during image indexing has been corrected. Also fixed was Sphider not picking up the width, height, and alt attributes in the img tag. Additionally, 'jpeg', 'webp', and 'svg' files are now recognized. Support for 'tif' image files has been dropped. (Does anyone even use tif/tiff any more?)
The User Guide has also been updated.
Sphider 5.3.0, SphiderLite 2.4.0 have been released
4 September 2023 - Multibyte string function emulations have been removed. The PHP mbstring extension is now required. The PHP 8.3 deprecation solution has been finalized and no action is needed by the user. jQuery has been updated, the code has been (again) cleaned to PSR-2 standards, and the basic search page has been cleaned of references to categories when categories are not presented. In SphiderLite, the created sitemap naming convention has been changed to match the full version.
While an upgrade can be done by simply replacing the files noted in the changelog, then running /admin/version_update.php, a clean install is recommended.
Sphider 5.2.1, SphiderLite 2.3.1 have been released
26 August 2023 - A preemptive update to replace the PHP rand() function (which will be deprecated in PHP 8.3) had to be updated. The fix works for PHP 8.2 and greater, but fails under PHP 8.1 and earlier. By default, Sphider 5.2.1 and SphiderLite 2.3.1 will work with any PHP version below 8.3. The user can change the code to work with PHP 8.2 and greater.
While a clean install is recommended, an upgrade can be done by simply replacing the files noted in the changelog, then running /admin/version_update.php.
Sphider 5.2.0, SphiderLite 2.3.0 have been released
22 August 2023 - Sphider 5.2.0 and SphiderLite 2.3.0 have a critical fix to repair a broken "Did you mean" spelling suggestions function. Also, a preemptive update was made to replace the PHP rand() function, which will be deprecated in PHP 8.3. Additionally, the User Agent string in the Lite version has been updated to match that of the full version.
While a clean install is recommended, an upgrade can be done by simply replacing the files noted in the changelog, then running /admin/version_update.php.
Sphider 5.1.0 has been released
22 July 2023 - Sphider 5.1.0 has a critical fix to the indexAll() function. There was missing code which presented a mismatch in the argument count.
While a clean install is recommended, an upgrade can be done by simply replacing the files noted in the changelog, then running /admin/version_update.php.
Sphider 5.0.0, SphiderLite 2.2.1 have been released
17 July 2022 - SphiderLite 2.2.1 has improvements to character set determination and better filtering. Fixes to code deprecations also continue. Sphider 5.0.0 has the same improvements and has also added a new feature: it is able to generate a report of all links referenced by each page indexed. The default User Agent string has been changed, which may help prevent some "false" 301 errors when indexing.
While a clean install is recommended, an upgrade can be done by simply replacing the files noted in the changelog, then running /admin/version_update.php.
Sphider 4.2.0, SphiderLite 2.2.0 have been released
1 July 2022 - Various bugs caused by PHP 8.1 have been fixed. Many instances of deprecated code have been fixed. One result of fixing deprecated code is that Sphider and SphiderLite will no longer function under PHP 5, which is obsolete anyway. PHP 7.0.0 or greater is required. A phrase search issue has been corrected. An issue with some Unicode corruption has been fixed. New protection against insanely long URLs (>255 characters) may result in 400 (or 404) errors on such pages, but it protects the database.
While a clean install is recommended, an upgrade can be done by simply replacing the files noted in the changelog, then running /admin/version_update.php.
Sphider 4.1.0-MB, SphiderLite 2.1.0 have been released
18 April 2022 - This update removes the re-index restart capability and includes improved indexing using sitemaps.
To upgrade, simply replace the files noted in the changelog, then run /admin/version_update.php.
Sphider 4.0.2-MB, SphiderLite 2.0.2 have been released
27 February 2022 - This is a minor update. Testing revealed a couple of PHP 8 vulnerabilities. These have been corrected.
To upgrade, simply replace and add the files noted in the changelog, then run /admin/version_update.php.
PHP 8.1
16 December 2021 - The most recent releases (Sphider 4.0.1-MB and SphiderLite 2.0.1) have been running under PHP 8.1 for two weeks on a test machine. No errors or warnings so far. Testing will continue, but it looks safe under PHP 8.1.
Sphider 4.0.1-MB, SphiderLite 2.0.1 have been released
28 July 2021 - This is a minor update. Russian stemming has been added. There are a few miscellaneous updates and corrections.
Sphider 4.0.0-MB, SphiderLite 2.0.0 have been released
18 January 2021 - The backup and restore utilities have been reworked to use MySQL directly. This is more dependable than relying on PHP. Also, a limited ability to resume an interrupted re-index process has been introduced. The process to determine page character set has been enhanced. Language file conversion to Unicode has been completed. Obsolete versions of code have been removed and general code cleanup done. Further safeguards against indexing of illegal characters have been implemented. SphiderLite has had more remnants of the full version removed.
SphiderLite 1.3.1 has been released
31 December 2020 - A critical flaw was discovered in the SphiderLite indexing code. This has been corrected in this version. ALL USERS OF PREVIOUS VERSIONS OF SPHIDERLITE ARE URGED TO UPGRADE TO 1.3.1! In addition to fixing this CRITICAL FLAW, SphiderLite has also improved the determination of a page character set, has improved filtering while indexing keywords, and improved removal of emojis.
Sphider 3.6.0-MB and SphiderLite 1.3.0 have been released
26 December 2020 - A potential runaway regular expression resulting in missing titles has been corrected. Crawl performance has been improved by fixing a bug that caused Sphider to try to crawl pages returning codes like 301, 401, 403, and 404. The absence of a robots.txt file on sites being crawled was generating warning errors, and this has been corrected. More potential PHP 8 errors have been averted. More obsolete code has been removed. The MB version now reports when a feed becomes invalid.
SphiderLite 1.2.3 has been released
20 December 2020 - A number of PHP warnings were removed, all of which could have been fatal in PHP 8. Also a minor improvement was made when parsing title tags. This is the companion to yesterday's Sphider-3.5.3-MB release.
Sphider 3.5.3-MB has been released
19 December 2020 - More deprecated code was removed, along with a number of PHP warnings, all of which could have been fatal in PHP 8. Also a minor improvement was made when parsing title tags. A similar change will be released for SphiderLite very soon.
Sphider 3.5.2-MB and SphiderLite 1.2.2 have been released
9 December 2020 - It was found that when a search returned results, if a new search was performed which should have found nothing, the previous search was returned! This was corrected.
Sphider 3.5.1-MB and SphiderLite 1.2.1 have been released
8 December 2020 - A correction was made to a Unicode function that will allow faster searches for installations with the mbstring extension installed.
Sphider 3.5.0-MB and SphiderLite 1.2.0 have been released
6 December 2020 - The text search function was split to improve efficiency. Previously, the search was repeated for each page of results displayed. Now the search is performed only once, and the appropriate subset displayed for each page viewed.
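The idea of searching once and serving each page from the stored result can be sketched as follows in Python (not Sphider's PHP code). The class name PagedSearch and the search_fn callback are hypothetical, and keeping results in an in-memory dictionary is an assumption of this sketch. How Sphider carries results between page views is not stated here.

```python
from typing import Callable


class PagedSearch:
    """Run a search once and serve each result page from the stored list."""

    def __init__(self, search_fn: Callable[[str], list], per_page: int = 10):
        self.search_fn = search_fn  # hypothetical: returns all ranked hits
        self.per_page = per_page
        self._cache: dict[str, list] = {}

    def page(self, query: str, page: int) -> list:
        """Return the hits for a 1-based page number."""
        if page < 1:
            raise ValueError("page numbers start at 1")
        if query not in self._cache:
            # The expensive search runs only on the first request.
            self._cache[query] = self.search_fn(query)
        start = (page - 1) * self.per_page
        return self._cache[query][start:start + self.per_page]
```

Each later page view then costs only a list slice instead of a repeated search.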
Sphider 3.4.5-MB and SphiderLite 1.1.5 have been released
30 November 2020 - Deprecated code which would have caused an error in PHP 8 was removed. A parsing issue with the robots.txt file was resolved. Two options were removed from the database tab. The Optimize function has been inert since Sphider changed to the InnoDB database structure, and the Truncate function proved to be risky due to the addition of foreign key constraints.
Sphider 3.4.4-MB and SphiderLite 1.1.4 have been released
7 November 2020 - Fixed a critical error in the truncate tables routine.
Sphider 3.4.3-MB, SphiderLite 1.1.3, and Sphider 2.4.3-PDO have been released
30 August 2020 - These are minor releases. Several issues due to PHP 7.3 and 7.4 were corrected. The PDO release is a one-time maintenance release to keep it functioning.
Sphider 3.4.2-MB and SphiderLite 1.1.2 have been released
15 August 2020 - This is a minor release. Some code in stem_class.php had become deprecated in PHP 7.4 and was updated.
Sphider 3.4.1-MB and SphiderLite 1.1.1 have been released
18 July 2020 - This is a minor release. The only change is that the version of jQuery used in searching has been updated to 3.5.1.
Typo corrected in tables.sql
5 July 2020 - A typo was discovered in the tables.sql script for both Sphider and SphiderLite. This has been corrected and any future downloads will reflect the correction. If you previously downloaded Sphider 3.4.0-MB or SphiderLite 1.1.0 and install using the SQL script, you may encounter an error. The correction follows:
Line 239 (237 in SphiderLite) in tables.sql is:
ENGINE=InnoDB DEFAULT CHARSET=utf8mb4COLLATE=utf8mb4_general_ci;
It should be:
ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
Sphider 3.4.0-MB and SphiderLite 1.1.0 have been released
7 December 2019 - Both of these have a couple of minor bug fixes and a new feature. It is now possible to create a sitemap from any previously indexed site.
SphiderLite 1.0.0 has been released
7 November 2019 - SphiderLite has the same site indexing capabilities as the regular version, but has had image and RSS functions removed.
Sphider 3.3.0-MB has been released
7 September 2019 - The database was altered to provide foreign key constraints. The admin functions were simplified to take advantage of database control provided by the added constraints. Sphider functionality is identical to the prior version.
Sphider 3.2.1-MB has been released
3 August 2019 - A fix was made to correct index omissions of words containing certain compound characters.
Sphider 3.2.0-MB has been released
16 June 2019 - A fix was made to correct missing keyword-link relationships. New features are the ability to set a minimum score for search results and the ability to use a 5 star system to rank the relevancy of results. Iframe support has been added.
Sphider 3.1.1-MB and 2.4.2-PDO have been released
16 May 2019 - A fix was made to correct indexing of words composed of non-English characters.
Sphider 3.1.0-MB and 2.4.2-PDO have been released
15 May 2019 - Sphider 3.1.0-MB replaces both Sphider 2.4.1 and 3.0.0-MB. Like 3.0.0-MB, 3.1.0-MB is multibyte string capable, but without the requirement for the PHP mbstring extension. If mbstring is present, it will be used. If not, it will be emulated. Several character set encoding issues have been resolved, word recognition has been improved, meta description indexing is more reliable, and presentational improvements have been made. Sphider 2.4.2-PDO has eliminated the possibility of certain UTF-8 characters being misinterpreted as ISO-8859-1. As previously announced, the PDO branch will remain available and critical fixes will be made as needed, but no future product enhancements are planned.
Versions 2.4.1, 2.4.1-PDO, and 3.0.0-MB have been released
29 April 2019 - Sphider 2.4.1 and 2.4.1-PDO are maintenance releases. Primary code changes involve how SQL errors are reported in the event a statement preparation fails. Preparation errors should never occur, but it is better to have good information on the off chance that one does. Also, a coding error introduced in 2.4.0 that prevented indexing from a browser in some instances has been corrected.
Sphider 3.0.0-MB is actually 2.4.1 with one BIG change... String handling is now equipped to handle multi-byte character strings. Since the PHP mbstring extension is used, and some hosts do not always provide this extension, Sphider 3.0.0-MB may not run for everyone. Also, being the beginning of a new approach to indexing and searching, some people may feel more comfortable sticking with what they KNOW will work! We feel this will ultimately prove to be the better way.
Version 2.4.0 Released, Legacy and PDO
10 April 2019 - Sphider 2.4.0 has completely re-worked the settings table, giving a number of new search options. Word stemming has been updated and now works with a number of languages other than English. A number of bugs have been corrected. The selection of search templates has changed, with the majority eliminated and replaced.
Downloads of SQLite and PostgreSQL to end
16 March 2019 - The download links for the SQLite and PostgreSQL editions of Sphider will be removed in April. Forum support for these editions will continue.
Version 2.3.1 Released, Legacy and PDO
5 March 2019 - Sphider 2.3.1 has corrected the issue with indexing and searching for words composed of non-Western characters.
Version 2.3.0 Distribution on Hold
4 March 2019 - Sphider 2.3.0 is being re-evaluated due to some unresolved issues with searches in foreign languages. The 2.2.0 versions are still available in the interim.
Version 2.3.0 Released
25 February 2019 - Sphider 2.3.0 and 2.3.0-PDO have been released. Consult the change logs to see what's new in this release. The Sphider User Guide is no longer included in the download packages, but is still available as a separate download. The User Guide has also been updated to reflect the changes.
Updates discontinued for SQLite and PostgreSQL
1 December 2018 - The decision was made not to produce any new updates for the SQLite and PostgreSQL forks of Sphider. They are frozen at release 2.1.0. There just hasn't been sufficient demand to make the time and effort required to maintain these "novelties" worthwhile. User support will still be provided via the forum.