Search Feed API pricing and info
Spell Checking API pricing and info
NOTE: All APIs support both the GET and POST methods. If your request is larger than 2K, you should use POST.
NOTE: All APIs support both http and https protocols.
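The 2K guideline in the note above can be applied mechanically. The following Python sketch is an illustration, not part of the Gigablast documentation: it encodes the parameters once and picks GET or POST based on the encoded size. The host `example.com`, the collection name `main`, the helper name `build_request`, and the reading of "2K" as 2048 bytes are all assumptions.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: "2K" in the note is read as 2048 bytes of encoded parameters.
MAX_GET_BYTES = 2048

def build_request(base_url, path, params):
    """Return a urllib Request that uses GET for small requests and POST otherwise."""
    encoded = urlencode(params)
    if len(encoded.encode("utf-8")) <= MAX_GET_BYTES:
        # Small request: parameters go in the query string.
        return Request(f"{base_url}{path}?{encoded}", method="GET")
    # Large request: parameters go in a form-encoded body.
    return Request(
        f"{base_url}{path}",
        data=encoded.encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

# Hypothetical host and collection; substitute your own instance.
small = build_request("https://example.com", "/search",
                      {"q": "test", "c": "main", "format": "json"})
large = build_request("https://example.com", "/search",
                      {"q": "x" * 5000, "c": "main"})
print(small.get_method(), small.full_url)
print(large.get_method(), len(large.data), "bytes in body")
```

The request objects are only constructed here; passing one to `urllib.request.urlopen` would actually send it.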
API by pages
- /search - search results page
- /get - gets cached web page
- /admin/master - master controls
- /admin/search - search controls
- /admin/spider - spider controls
- /admin/proxies - proxies
- /admin/log - log controls
- /admin/masterpasswords - master passwords
- /admin/addcoll - add a new collection
- /admin/delcoll - delete a collection
- /admin/clonecoll - clone one collection's settings to another
- /admin/hosts - hosts status
- /admin/stats - general statistics
- /account/adduser - add a new user to the accounting system
- /account/edituser - update a user's information
- /account/addad - add an advertisement that will be displayed in the search results
- /account/editad - update a search advertisement
- /account/showad - show the details of a search ad
- /account/pausead - pause a search ad
- /account/resumead - resume a search ad
- /account/deletead - delete a search ad
- /account/deposit - deposit money into a user's account from their credit card on file
- /account/refund - make a refund of a deposit transaction to a user's current credit card
- /account/showuser - show the details of a user
- /account/showusers - show a list of all the users, only available for the master admin
- /admin/rebuild - rebuild data
- /admin/reindex - query delete/reindex
- /admin/scale - page scale
- /admin/inject - inject a url into the index
- /admin/addurl - add url page for admin
/search
# | Parm | Type | Title | Default Value | Description
1 | format | STRING | output format | html | Display output in this format. Can be html, json or xml.
2 | q | STRING | query | | The query to perform. See help. See the query operators below for more info. REQUIRED
3 | c | STRING | collection | | Search this collection. Use multiple collection names separated by whitespace to search multiple collections at once. REQUIRED
4 | aicompanion | INT32 | ai companion | 0 | Values of 1 through 4 select the corresponding AI companion.
5 | n | INT32 | number of results per query | 25 | The number of results returned. If you want more than 1000 results you must use &stream=1 so Gigablast does not run out of memory. Search feed customers are typically limited to 10 results per query, so additional queries must be conducted to receive more results.
6 | searchtype | STRING | searchtype | | Set to news or images to search for those respective entities.
7 | s | INT32 | first result num | 0 | Start displaying at search result #X. Starts at 0. If you want more than 1000 results in total, you must use &stream=1 so Gigablast does not run out of memory.
8 | uip | STRING | user ip | | The ip address of the searcher. It can be passed back for use by the autoban technology, which bans abusive IPs. Required for maintaining statistics for ads if search results are in a JSON or XML feed where the searcher IP is not directly provided by the connecting socket.
9 | showerrors | BOOL (0 or 1) | show errors | 0 | Show errors from generating search result summaries rather than just hide the docid. Useful for debugging.
10 | showanomalies | BOOL (0 or 1) | show anomalies | 0 | Show search results that only contain the query terms in some anomalous link texts.
11 | showurlasname | BOOL (0 or 1) | show url as name | 0 | Show the website name instead of the url itself in the search results. Used by news search.
12 | showimages | BOOL (0 or 1) | show images | 0 | Should we return or show the thumbnail images in the search results?
13 | showgoodimages | BOOL (0 or 1) | show images if good match | 1 | Should we return or show the thumbnail images in the search results if they are close to all the search terms?
14 | sc | BOOL (0 or 1) | site clustering | 0 | Should search results be site clustered? This limits each site to appearing at most twice in the search results. Sites are subdomains for the most part, like abc.xyz.com.
15 | tc | CHAR | topic clustering | 0 | Should search results be clustered by topic? Used by news search.
16 | ppt | CHAR | promote popular topics | 0 | If topic clustering is enabled, should we promote highly clustered results to the top? A form of re-ranking.
17 | hacr | BOOL (0 or 1) | hide all clustered results | 0 | Only display at most one result per site.
18 | dr | BOOL (0 or 1) | dedup results | 0 | Should similar search results be removed?
19 | qe | INT32 | query expansion level | variable | 0 means to not do any query expansion. 1 means to expand query terms to basic word endings (like -ing -ed -s in the case of English, etc.) and 2 means to do basic word endings plus synonym expansion. Typically, the more expansion you do, the slower the search response time, so use 0 if you care more about fast response times. If you are doing queries for spell checking using &n=0&spell=1 then you will need to specify &qe=1 or &qe=2 in order to get synonyms.
20 | spell | BOOL (0 or 1) | do spell checking | variable | If enabled, when Gigablast finds a spelling recommendation it will be included in the XML <spellingSuggestion> tag or JSON "spellingSuggestion" field. Default is 0 if using an XML or JSON feed, 1 otherwise.
21 | autospell | BOOL (0 or 1) | auto correct spelling | variable | If enabled, when Gigablast is CONFIDENT of a spelling recommendation it will automatically re-perform the query with the recommended spelling. Spell checking must be enabled for this to work. This is a default value and can be overridden directly with the autospell parm in each individual http request.
22 | relqueries | BOOL (0 or 1) | show related queries | 0 | Offer related queries at bottom of search results.
23 | dmoz | BOOL (0 or 1) | display dmoz categories in results | 0 | If enabled, results in dmoz will display their categories on the results page. The url itself must be explicitly in the DMOZ category.
24 | idmoz | BOOL (0 or 1) | display indirect dmoz categories in results | 0 | If enabled, results in dmoz will display their indirect categories on the results page. That is, the categories of which their root url is a member.
25 | stream | CHAR | stream search results | 0 | Stream search results back on socket as they arrive. Useful when thousands/millions of search results are requested. Required when doing such things, otherwise Gigablast could run out of memory. Only supported for JSON and XML formats, not HTML. You must use this if you want more than 1000 results.
26 | secsback | INT32 | seconds back | 0 | Limit to results with pub dates from this many seconds ago. Use 0 to disable.
27 | usetime | INT64 | use time | 0 | Use this provided UTC timestamp rather than the current time for secsback or for news search. Helps with debugging. 0 means to ignore it.
28 | filetype | STRING | filetype | | Restrict results to this filetype. Supported filetypes are pdf, doc, html, xml, json, xls.
29 | facet | STRING | facet query term | | A query term that is prepended to the query. For example, &facet=gbfacetint%3Atype for document type facets.
30 | fast | BOOL (0 or 1) | fast results | 0 | Sacrifice some quality and result filtering for the sake of speed.
31 | nf | INT32 | max number of facets to return | 50 | Max number of facets to return.
32 | snik | FLOAT32 | site popularity weight | 1.000000 | If this is 1.0 then we weight more popular sites more in the results. If it is 0.0 then we do not do any such weighting. Anything in between has a linearly proportional effect.
33 | qlangcountry | STRING | country lang preference | | Use the specified country and language. Example: en-us or en-uk or de-de, etc.
34 | qcountry | STRING | sort country preference | us | Default country to use for ranking results. Value should be any country code abbreviation, for example "us" for United States.
35 | qlang | STRING | sort language preference | en | Default language to use for ranking results. Value should be any language abbreviation, for example "en" for English. Use xx to give ranking boosts to no language in particular. See the language abbreviations at the bottom of the url filters page.
36 | langw | FLOAT32 | language weight | variable | Use this to override the default language weight for this collection. The default language weight can be set in the search controls and is usually something like 40.0, which means that we multiply a result's score by 40.0 if it is from the same language as the query or its language is unknown.
37 | onlylang | STRING | language restrictions | | All documents returned will be in this language.
38 | ns | INT32 | number of summary excerpts | variable | How many summary excerpts to display per search result?
39 | link | STRING | restrict search to pages that link to this url | | The url which the pages must link to. From the advanced search page. All returned results will link to the specified url. Example: &link=http://www.foo.com/
40 | sites | STRING | restrict results to these sites | | Returned results will have URLs from this space-separated list of sites. Can have up to 200 sites. A site can include sub folders. This allows you to build a Custom Topic Search Engine. Example: ?q=test&sites=foo.com+bar.com+baz.com
41 | docids | STRING | restrict results to these docids | | Returned results will be from this space-separated list of docIds. Can have up to 200. This is used for the View Full Coverage link for news search, among other things. Example: ?q=test&docids=12345678+9877665432
42 | ff | BOOL (0 or 1) | family filter | 0 | Remove objectionable results if this is enabled.
43 | qh | BOOL (0 or 1) | highlight query terms in summaries | 1 | Use to disable or enable highlighting of the query terms in the summaries.
44 | hq | STRING | cached page highlight query | | Highlight the terms in this query instead.
45 | showcached | BOOL (0 or 1) | show cached links | 1 | Show cached links next to each result? For HTML output only, of course.
46 | bq | INT32 | boolean status | 2 | Can be 0 or 1 or 2. 0 means the query is NOT boolean, 1 means the query is boolean and 2 means to auto-detect.
47 | dt | STRING | meta tags to display | | A space-separated string of meta tag names. Do not forget to url-encode the spaces to +'s or %20's. Gigablast will extract the contents of these specified meta tags out of the pages listed in the search results and display that content after each summary. For example, &dt=description will display the meta description of each search result. &dt=description:32+keywords:64 will display the meta description and meta keywords of each search result and limit the fields to 32 and 64 characters respectively. When used in an XML feed the <display name="meta_tag_name">meta_tag_content</> XML tag will be used to convey each requested meta tag's content.
48 | rdc | BOOL (0 or 1) | return number of docs per topic | 1 | Use 1 if you want Gigablast to return the number of documents in the search results that contained each topic (gigabit).
49 | rd | BOOL (0 or 1) | return docids per topic | 0 | Use 1 if you want Gigablast to return the list of docIds from the search results that contained each topic (gigabit).
50 | dio | BOOL (0 or 1) | return docids only | 0 | Set to 1 to return only docids as query results.
51 | prepend | STRING | prepend | | Prepend this to the supplied query, followed by a | character.
52 | sb | BOOL (0 or 1) | show banned pages | 0 | Show banned pages.
53 | icc | INT32 | include cached copy of page | 0 | Causes a cached copy of the content to be returned instead of the summary.
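To show how the parameters in the table combine, here is a small Python sketch that builds /search URLs for a JSON feed and pages through results with n and s. Only the parameter names (q, c, format, n, s, stream, dt) come from the table. The host `example.com`, the collection name `main`, and the function names are hypothetical. The code only builds URLs and sends nothing.

```python
from urllib.parse import urlencode

def search_url(base_url, query, collection, n=10, start=0, fmt="json", **extra):
    """Build a /search URL from the documented parameters (q, c, format, n, s)."""
    params = {"q": query, "c": collection, "format": fmt, "n": n, "s": start}
    if n > 1000:
        # The table says more than 1000 results requires stream=1,
        # and that streaming is supported for JSON and XML but not HTML.
        if fmt == "html":
            raise ValueError("stream=1 is not supported for HTML output")
        params["stream"] = 1
    params.update(extra)
    return f"{base_url}/search?{urlencode(params)}"

def page_urls(base_url, query, collection, total, per_page=10):
    """Yield one URL per page, advancing the s (first result num) offset."""
    for start in range(0, total, per_page):
        count = min(per_page, total - start)
        yield search_url(base_url, query, collection, n=count, start=start)

# Hypothetical host and collection name.
BASE = "https://example.com"
for url in page_urls(BASE, "solar power", "main", total=25):
    print(url)

# urlencode turns the space in a space-separated value such as dt into "+"
# (and percent-encodes the colons, which decode back to the same value).
print(search_url(BASE, "test", "main", dt="description:32 keywords:64"))
```

A page size of 10 is used because the table notes that search feed customers are typically limited to 10 results per query.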
/get
# | Parm | Type | Title | Default Value | Description
1 | format | STRING | output format | html | Display output in this format. Can be html, json or xml.
2 | d | INT64 | docId | 0 | The docid of the cached page to view. REQUIRED
3 | url | STRING | url | | Instead of specifying a docid, you can get the cached webpage by url as well. REQUIRED
4 | c | STRING | collection | | Get the cached page from this collection. REQUIRED
5 | strip | INT32 | strip | 0 | Set to 1 or 2 to strip various tags from the cached content.
6 | ih | BOOL (0 or 1) | include header | 1 | Set to 1 to include the Gigablast header at the top of the cached page, 0 to exclude the header.
7 | q | STRING | query | | Highlight this query in the page.
<response>
    <statusCode>0</statusCode>
    <statusMsg>Success</statusMsg>
    <url><![CDATA[http://www.doi.gov/]]></url>
    <docId>34111603247</docId>
    <cachedTimeUTC>1404512549</cachedTimeUTC>
    <cachedTimeStr>Jul 04, 2014 UTC</cachedTimeStr>
    <content><![CDATA[<html><title>Some web page title</title><head>My first web page</head></html>]]></content>
</response>

{
    "response":{
        "statusCode":0,
        "statusMsg":"Success",
        "url":"http://www.doi.gov/",
        "docId":34111603247,
        "cachedTimeUTC":1404512549,
        "cachedTimeStr":"Jul 04, 2014 UTC",
        "content":"<html><title>Some web page title</title><head>My first web page</head></html>"
    }
}
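As a rough illustration of working with /get, the Python sketch below builds a request URL from the parameters in the table and parses a JSON response shaped like the sample above. The host `example.com`, the collection name `main`, and both function names are hypothetical. Treating any nonzero statusCode as a failure is also an assumption, since this page only shows the success case.

```python
import json
from datetime import datetime, timezone
from urllib.parse import urlencode

def get_url(base_url, collection, doc_id=None, url=None, strip=0, include_header=True):
    """Build a /get URL; the cached page is identified by docid (d) or by url."""
    if (doc_id is None) == (url is None):
        raise ValueError("pass exactly one of doc_id or url")
    params = {"c": collection, "format": "json", "strip": strip,
              "ih": int(include_header)}
    if doc_id is not None:
        params["d"] = doc_id
    else:
        params["url"] = url
    return f"{base_url}/get?{urlencode(params)}"

def parse_cached_page(body):
    """Extract the fields of a JSON /get response shaped like the sample."""
    resp = json.loads(body)["response"]
    if resp["statusCode"] != 0:
        # Assumption: nonzero status codes indicate an error.
        raise RuntimeError(f"request failed: {resp['statusMsg']}")
    return {
        "url": resp["url"],
        "doc_id": resp["docId"],
        # cachedTimeUTC is a Unix timestamp in seconds.
        "cached_at": datetime.fromtimestamp(resp["cachedTimeUTC"], tz=timezone.utc),
        "content": resp["content"],
    }

# The sample JSON response from this page, used in place of a live request.
SAMPLE = '''{ "response":{ "statusCode":0, "statusMsg":"Success",
  "url":"http://www.doi.gov/", "docId":34111603247,
  "cachedTimeUTC":1404512549, "cachedTimeStr":"Jul 04, 2014 UTC",
  "content":"<html><title>Some web page title</title><head>My first web page</head></html>" } }'''

page = parse_cached_page(SAMPLE)
print(get_url("https://example.com", "main", doc_id=page["doc_id"]))
print(page["url"], page["cached_at"].isoformat())
```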
/admin/master
Only available to admin
/admin/search
Only available to admin
/admin/spider
Only available to admin
/admin/proxies
Only available to admin
/admin/log
Only available to admin
/admin/masterpasswords
Only available to admin
/admin/addcoll
Only available to admin
/admin/delcoll
Only available to admin
/admin/clonecoll
Only available to admin
/admin/hosts
Only available to admin
/admin/stats
Only available to admin
/account/adduser
Only available to admin
/account/edituser
Only available to admin
/account/addad
Only available to admin
/account/editad
Only available to admin
/account/showad
Only available to admin
/account/pausead
Only available to admin
/account/resumead
Only available to admin
/account/deletead
Only available to admin
/account/deposit
Only available to admin
/account/refund
Only available to admin
/account/showuser
Only available to admin
/account/showusers
Only available to admin
/admin/rebuild
Only available to admin
/admin/reindex
Only available to admin
/admin/scale
Only available to admin
/admin/inject
Only available to admin
/admin/addurl
Only available to admin