{"id":338177,"date":"2016-12-19T04:42:31","date_gmt":"2016-12-19T12:42:31","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&#038;p=338177"},"modified":"2016-12-19T04:42:50","modified_gmt":"2016-12-19T12:42:50","slug":"sffinx-satori-free-text-facet-ingestion-nlp-extraction","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/sffinx-satori-free-text-facet-ingestion-nlp-extraction\/","title":{"rendered":"SffinX: Satori Free-text Facet Ingestion with Nlp eXtraction"},"content":{"rendered":"<p>=> Mar 2014 : Dec 2015 &#8230;<\/p>\n<p>SffinX aimed to extract millions of entities, relations, and facts from the free text of web pages for a specific domain and then map them to the Satori ontology (MSO). SffinX targeted low human effort for training by only giving a handful of seed examples and low processing time in order to process millions of web pages and extract millions of entities, relations, and facts in the range of hours.<\/p>\n<p>First, SffinX collects a domain-specific corpus around the given examples. Then it extracts entities with a 22-class NER and verb-based relations with a constituency parser, and uses them to formulate facts. SffinX then clusters relations lexically and semantically and ranks the facts based on signals from a Confidence Scorer. Finally, SffinX maps the entities, the relations, and the facts, if found, to the KB. SffinX also has a real-time experience in which entities, relations, and facts are extracted on the fly from news RSS feeds.<\/p>\n<p>SffinX was a three-way collaboration among ATL Cairo, the Microsoft Satori Extraction team, and MSR Asia.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>=> Mar 2014 : Dec 2015 &#8230; SffinX aimed to extract millions of entities, relations, and facts from the free text of web pages for a specific domain and then map them to the Satori ontology (MSO). 
SffinX targeted low human effort for training by only giving a handful of seed examples and low processing time in order to process [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13555],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-338177","msr-project","type-msr-project","status-publish","hentry","msr-research-area-search-information-retrieval","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"2014-03-01","related-publications":[],"related-downloads":[],"related-videos":[],"related-groups":[],"related-events":[],"related-opportunities":[],"related-posts":[],"related-articles":[],"tab-content":[],"slides":[],"related-researchers":[],"msr_research_lab":[],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/338177","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":0,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/338177\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=338177"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=338177"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=338177"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=338177"},{"
taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=338177"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}