{"id":83,"date":"2026-03-18T20:57:51","date_gmt":"2026-03-18T20:57:51","guid":{"rendered":"https:\/\/byte64.com\/?p=83"},"modified":"2026-03-18T21:01:05","modified_gmt":"2026-03-18T21:01:05","slug":"generating-user-data","status":"publish","type":"post","link":"https:\/\/byte64.com\/?p=83","title":{"rendered":"Generating User Data"},"content":{"rendered":"\n<p>A search engine depends on a feedback loop of users making queries, following links, returning to the search page and rewriting their queries. All of these feed into an understanding of whether they&#8217;re finding the results they&#8217;re looking for.<\/p>\n\n\n\n<p>Bootstrapping a system like this is difficult because you don&#8217;t start with any users and your search results aren&#8217;t tuned. Google search used the <a href=\"https:\/\/en.wikipedia.org\/wiki\/PageRank\" data-type=\"link\" data-id=\"https:\/\/en.wikipedia.org\/wiki\/PageRank\">PageRank algorithm<\/a> to get started. PageRank looks at characteristics of the links in and out of each page to determine its quality, and that quality is used for the initial scoring absent user data.<\/p>\n\n\n\n<p>I suppose I could implement PageRank on Wikipedia, but I figure that most Wikipedia pages are going to be high quality, and what matters most is the relevance of a result to the search query. So, instead I&#8217;ve started generating some user data using AI.<\/p>\n\n\n\n<p>I&#8217;m calling my tool a prober. It generates a query intent, translates that into a short query string, sees the results given back by the search system and then evaluates the top five results to see whether they match its original intent. 
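<\/p>\n\n\n\n<p>As a rough sketch, with hypothetical helper names (generate_intent, run_search, evaluate_result, log_judgment) standing in for the real code, one prober cycle looks something like this:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ Illustrative sketch of one prober cycle; the helpers here are\n\/\/ stand-ins, not the actual implementation.\nloop {\n    let intent = generate_intent();                  \/\/ LLM call: an intent plus a 2 to 4 word query\n    let results = run_search(intent.query.clone());  \/\/ ask the search system\n    for doc in results.into_iter().take(5) {         \/\/ judge the top five results\n        let snippet: String = doc.markdown.chars().take(1000).collect();\n        let judgment = evaluate_result(intent.clone(), snippet); \/\/ LLM relevance score, 0.0 to 1.0\n        log_judgment(judgment);\n    }\n    std::thread::sleep(std::time::Duration::from_secs(20 * 60)); \/\/ runs every 20 minutes\n}<\/code><\/pre>\n\n\n\n<p>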
It is currently looking at the first 1000 bytes of markdown for each result, but I&#8217;m going to change it to look at the whole page if that&#8217;s not too expensive.<\/p>\n\n\n\n<p>Here&#8217;s an example prober result:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"895\" src=\"https:\/\/byte64.com\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-18-at-2.01.17-p.m-1024x895.png\" alt=\"\" class=\"wp-image-84\" srcset=\"https:\/\/byte64.com\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-18-at-2.01.17-p.m-1024x895.png 1024w, https:\/\/byte64.com\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-18-at-2.01.17-p.m-300x262.png 300w, https:\/\/byte64.com\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-18-at-2.01.17-p.m-768x671.png 768w, https:\/\/byte64.com\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-18-at-2.01.17-p.m-1536x1343.png 1536w, https:\/\/byte64.com\/wp-content\/uploads\/2026\/03\/Screenshot-2026-03-18-at-2.01.17-p.m.png 1620w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>You can find more examples here:<\/p>\n\n\n\n<p class=\"has-text-align-center\"><a href=\"https:\/\/byte64.com\/search\/#prober\">https:\/\/byte64.com\/search\/#prober<\/a><\/p>\n\n\n\n<p>You can see the intent, which is separate from the query. The intent is used to judge each of the results. For this query, we got quite lucky that the main Python pages were in the top five results. You&#8217;ll notice that there&#8217;s a warning about &#8220;language&#8221;. I&#8217;ve excluded that word from my search index keys because it appears on every single Wikipedia page. I&#8217;m going to need a more careful parser to include such words when they&#8217;re important to the page (e.g. 
<a href=\"https:\/\/en.wikipedia.org\/wiki\/English_language\">https:\/\/en.wikipedia.org\/wiki\/English_language<\/a>) and skip them only when they&#8217;re not relevant.<\/p>\n\n\n\n<p>It&#8217;s quite a new development to have AI in these parts of search. AI has long been used to tune various parts of the scoring function, but it&#8217;s new to have it judge relevance or quality. Typically, to evaluate the quality of results, search companies outsource the work to human raters who are provided with a scoring guide. Those guides say how to rate the relevance of a specific result to a query, whether one result is better than another, how authoritative a source is, etc. Now, we can have an LLM read a scoring guide and do this evaluation faster, cheaper and hopefully with comparable accuracy.<\/p>\n\n\n\n<p>Here are the basic requests I&#8217;m making to generate the intent and evaluate the relevance:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>let search_prompt = \"You are simulating a user browsing Wikipedia. Pick a random, specific topic (not too broad like 'history', but something like 'Apollo 11 mission' or 'Python programming language'). Write a brief sentence describing your search intent, and then provide a 2 to 4 word search query based on that intent.\";\n\n\/\/ Rust doesn't join string literals with +, so the format string is\n\/\/ assembled with concat! instead.\nlet eval_prompt = format!(\n    concat!(\n        \"Original Search Intent: {}\\n\\n\",\n        \"Document Snippet (first 10000 chars):\\n---\\n{}\\n---\\n\\n\",\n        \"Please evaluate how well this document matches the user's intent. Provide a relevance score from 0.0 to 1.0 and a brief explanation.\"),\n    gemini_data.intent, snippet);<\/code><\/pre>\n\n\n\n<p>One thing I haven&#8217;t figured out yet is how to make the intent generation produce something new each time. I have it running every 20 minutes, and we&#8217;ll see how often it asks about the Apollo missions. I may end up randomly picking documents and asking the model to form an intent and query for each one.<\/p>\n\n\n\n<p>Where to go next? 
So far the relevance judgments come from the prober itself, but a search engine should do its own logging to understand user behavior. Soon, I&#8217;ll build logs on the server side to see who&#8217;s querying, about what, and whether they&#8217;re getting the results they want.<\/p>\n\n\n\n<p>~Andrew<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A search engine depends on a feedback loop of users making queries, following links, returning to the search page and rewriting their queries. All these feed into an understanding of whether they&#8217;re finding the results they&#8217;re looking for. Bootstrapping a system like this is difficult because you don&#8217;t start with any users and your search [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,18],"tags":[21,7],"class_list":["post-83","post","type-post","status-publish","format-standard","hentry","category-coding-with-ai","category-search","tag-llm","tag-wikipedia"],"_links":{"self":[{"href":"https:\/\/byte64.com\/index.php?rest_route=\/wp\/v2\/posts\/83","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/byte64.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/byte64.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/byte64.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/byte64.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=83"}],"version-history":[{"count":4,"href":"https:\/\/byte64.com\/index.php?rest_route=\/wp\/v2\/posts\/83\/revisions"}],"predecessor-version":[{"id":88,"href":"https:\/\/byte64.com\/index.php?rest_route=\/wp\/v2\/posts\/83\/revisions\/88"}],"wp:attachment":[{"href":"https:\/\/byte64.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=83"}],"wp:term":[{"taxonomy":"category","embedd
able":true,"href":"https:\/\/byte64.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=83"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/byte64.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=83"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}