Jan O. Pedersen
Bring new information access technologies to product
PhD in Statistics
BA in Statistics
Beginning in late 2002, Yahoo! entered the search technology space by first purchasing Inktomi (Dec 2002), a high-quality OEM algorithmic search service, and then purchasing Overture (Oct 2003), the founder of sponsored search advertising. Overture brought with it two additional search engines: AltaVista and FAST Internet (
Chief Scientist and VP, Search and Advertising Technology Group
work on marketplace simulation and auction design for the sponsored search product. Facilitated the hiring of economists and other experts to deepen our understanding of this technology area. Co-chaired the Marketplace Design Group, which defined the marketplace rules for the
· Co-developed strategy for the search team: how to grow search share in the face of a strong category leader with a powerful brand. Suggested a shift in focus from "commoditize and distribute" to differentiation.
· Initiated the Next Generation Search program focused on intent-based search. The program combines higher-level query analysis, rich content analysis and multi-phase ranking to sharpen results for specific user intents.
· As part of an initiative to institutionalize a Technology career track within Yahoo!, split the role of Chief Scientist from VP Applied Sciences. Helped select the VP Applied Sciences and integrated his team into Search.
Chief Scientist, Search and Marketplace
· Instituted the Relevance group, a science team devoted to search algorithm development, from the various advanced development groups brought in by acquisition.
· Scaled the Relevance group from 20 to 90 scientists and engineers.
· Led the development of several key Web search technologies:
· Machine Learned Ranking: a methodology for disciplined continuous improvement of search relevance ranking
· Query Speller: a text mining system that derives highly accurate run-time spelling corrections from the analysis of query logs
· Guided the development of technology roadmaps for the five Relevance groups and participated in the development of search mission and strategy as part of the senior management team.
AltaVista was at one point the search share leader, but its fortunes rapidly declined after the bursting of the internet bubble. I joined AltaVista as part of the turn-around team (with Jim Barnett and John Ellis). The goal was to stabilize search share and solidify the search technology and search operations in order to find a suitable buyer. This was accomplished when AltaVista was purchased by Overture in 2003. Overture was in turn purchased by Yahoo! a few months later.
· As the most senior technical contributor, I worked closely with the VP Engineering (John Ellis) to set technical direction for a staff of 100 engineers.
· Helped establish and guide an advanced development group of 10 scientists who acted as a shared resource of specialized skills for the engineering organization.
· Chaired a Scientific Advisory Board including Jerry Friedman (Stanford), Hector Garcia-Molina (Stanford) and Marti Hearst (UC Berkeley).
· Participated as a member of the senior team in setting company strategy. In particular, participated in the acquisition and disclosure discussions with Overture and Yahoo!
· Led the AltaVista IP effort and initiated the filing of several additional patents covering ranking and other search technologies.
Chief Scientist: Enkata Systems (2002)
Enkata is a CRM analytics firm that specializes in deriving actionable insights from patterns in customer interactions. We developed text mining technology for use as derived attributes.
Engineering Director: Centrata (2001)
Centrata is a KP-backed startup whose original business plan was to build a P2P infrastructure platform. I was hired to build an Internet search application on this platform. Centrata’s business plan shifted in January 2001 to datacenter process automation.
VP Engineering: OpenGrid (2000)
OpenGrid was a Motorola-backed startup developing an Internet-based application sharing technology similar to Zaplets. Unfortunately, the attempted extension of this technology to the wireless Internet was premature.
Infoseek was one of the first-wave Internet search engine companies. I joined Infoseek after it had gone public and experienced its transformation into the Go Network subsequent to the Disney acquisition. I had two roles at Infoseek: the first was Advanced Technology Director, reporting to Steve Kirsch, the Infoseek founder and Chairman. The second was Director of the core Internet Search and Spidering service for Go Network.
Director, Advanced Technology
· Developed and prototyped several approaches to economically scaling the Infoseek Search Service index through distributed spidering and search. Transferred the resulting technology (code name BFI) to the Search Service organization.
Director, Search and Spidering
· Subsequent to the Disney acquisition of Infoseek, I was responsible for design, engineering, product management and operations of the core Infoseek Search Service within Go Network, an Internet product with an annual budget of $6M, $40M in revenues and 5.3 billion page views.
Verity is the leading vendor of text retrieval software toolkits. I had two roles at Verity: I was hired as manager of the Advanced Technology group. Subsequently I was director of the Server Products group.
Manager, Advanced Technology Group
· Managed a group of 5 PhD-level engineers. Responsible for the integration of new component technologies into the core Verity search engine product: clustering, summarization and QBE
· Managed the creation of the Knowledge Organizer (Yahoo! in a box) text categorization product concept.
Director, Server Group
· Managed 12 engineers. Responsible for the Information Server/Agent Server products: release 3.1 and service packs, rearchitected spider, rearchitected push component.
· Responsible for new Knowledge Organizer product development: integrated search and text categorization
Xerox PARC is one of the corporate research centers for Xerox Corporation. I first became affiliated with PARC in graduate school when I worked there as a student consultant (that work later formed the basis for my thesis). I had multiple roles at Xerox and at PARC. I was first hired as a system software developer by Xerox AIS, a business unit attempting to commercialize Xerox Interlisp-D. Later I worked at PARC as a researcher under the aegis of the new Information Access Initiative. Finally, I was a research Area Manager responsible for stewarding the Quantitative Content Analysis Group.
Member of the Research Staff
· One of two authors of the Lisp-based TDB text retrieval system: 70,000 lines of code
· Contributed to a high-performance Lisp-based finite-state calculus package
· Developed the PARC Hidden Markov Model-based part-of-speech tagger
Area Manager, Quantitative Content Analysis
· Managed a research group of 8 scientists in information access technologies. Area output included 30 patent applications and technology transfer to Xsoft
· Developed the Scatter/Gather cluster-based document browsing paradigm
· Xerox AIS was a PARC spinout devoted to commercializing Xerox Interlisp-D.
Senior Member of the Technical Staff
· Contributed to the Lyric and Medley releases of Xerox Common Lisp
· Responsible for arrays, arithmetic and sequence functions
Fifteen issued patents.
For a complete list see http://www.uspto.gov/patft/index.html
Over thirty refereed publications on information access
Top cited publications include:
Cutting, D., J. Kupiec, J. Pedersen and P. Sibun. 1992. A Practical Part-of-Speech Tagger. Proc. 3rd
M. A. Hearst and J. O. Pedersen. 1996. Reexamining the cluster hypothesis: Scatter/Gather on retrieval results. ACM SIGIR’96.
D. R. Cutting, D. R. Karger and J. O. Pedersen. 1993. Constant interaction-time Scatter/Gather browsing of large document collections. ACM SIGIR’93.
Ramana Rao, Jan O. Pedersen, Marti A. Hearst, Jock D. Mackinlay, Stuart K. Card, Larry Masinter, Per-Kristian Halvorsen and George G. Robertson. 1995. Rich Interaction in the Digital Library. Communications of the ACM.
Program committee member for SIGIR, WWW and CIKM conferences, most recently for SIGIR 2008, WWW 2008, EC 2008 and CIKM 2008
ACM Distinguished Scientist