Efficient file path indexing for a content repository

Bibliographic Details
Title: Efficient file path indexing for a content repository
Patent Number: 11,487,707
Publication Date: November 01, 2022
Appl. No: 13/460391
Application Filed: April 30, 2012
Abstract: Techniques for indexing file paths of items in a content repository may include querying, by at least one processor, a content repository stored on at least one computer readable storage medium for one or more items that qualify for file path indexes, do not have the file path indexes, and have a parent folder that has a file path index, wherein the querying does not depend on results from previous queries, and wherein the file path index indicates an associated item's location in a folder tree, creating, by the at least one processor, the file path indexes for resulting items from the querying, and, if the querying results in at least one resulting item, repeating the querying of the content repository and the creating of the file path indexes until the querying results in zero resulting items.
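The loop described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the `item` table, its columns (`parent_id`, `name`, `indexable`, `path`), the representation of a file path index as a slash-separated string, and the function names are all hypothetical. The sketch uses Python's standard `sqlite3` module as a stand-in for a relational repository. Each pass runs the same query, which does not depend on the results of earlier passes, and the loop stops when the query returns zero items.

```python
import sqlite3


def build_path_indexes(conn):
    """Index unindexed children of indexed folders until none remain.

    Returns the number of passes that created at least one index.
    Table and column names are assumptions made for this sketch.
    """
    passes = 0
    while True:
        # Stateless query: items that qualify, have no path index yet,
        # and whose parent folder already has one.
        rows = conn.execute(
            """
            SELECT c.id, p.path, c.name
            FROM item AS c
            JOIN item AS p ON c.parent_id = p.id
            WHERE c.indexable = 1
              AND c.path IS NULL
              AND p.path IS NOT NULL
            """
        ).fetchall()
        if not rows:
            return passes  # zero resulting items: indexing is complete
        conn.executemany(
            "UPDATE item SET path = ? WHERE id = ?",
            [(parent_path.rstrip("/") + "/" + name, item_id)
             for item_id, parent_path, name in rows],
        )
        passes += 1


def demo_repository():
    """Build a small in-memory repository with a pre-indexed root folder."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE item (id INTEGER PRIMARY KEY, parent_id INTEGER, "
        "name TEXT, indexable INTEGER, path TEXT)"
    )
    conn.executemany(
        "INSERT INTO item VALUES (?, ?, ?, ?, ?)",
        [
            (1, None, "", 1, "/"),             # root folder, already indexed
            (2, 1, "docs", 1, None),
            (3, 2, "a.txt", 1, None),
            (4, 2, "sub", 1, None),
            (5, 4, "b.txt", 1, None),
            (6, None, "orphan.txt", 0, None),  # not filed in any folder
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_repository()
    print("passes:", build_path_indexes(conn))
    for row in conn.execute("SELECT id, path FROM item ORDER BY id"):
        print(row)
```

In this sketch each pass indexes one more level of the folder tree, so the number of productive passes equals the depth of the deepest unindexed item below an indexed folder. Because the query is recomputed from the repository's current state every time, an interrupted run can simply be restarted.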
Inventors: Victor, David Brian (Gilroy, CA, US)
Assignees: INTERNATIONAL BUSINESS MACHINES CORPORATION (Armonk, NY, US)
Claim: 1. A method comprising: maintaining file path indexes for file paths of items in a content repository, stored on at least one computer readable storage device, wherein the items in the content repository comprising one of indexable content and non-indexable content, wherein the items of indexable content comprise files and folders represented in a hierarchy of folders that are configured to have file path indexes to uniquely identify the files and the folders in the hierarchy of folders, and wherein the items of non-indexable content comprising files and folders not represented in the hierarchy of folders; querying, by at least one processor, the content repository for a root folder and an unfiled folder tree, wherein an unfiled folder tree has items that do not have file path indexes; creating file path indexes for items resulting from the query; querying, by the at least one processor, the hierarchy of folders for an item that qualifies for a file path index by comprising indexable content, does not have a file path index, and resides in a folder that has a file path index; and creating, by the at least one processor, a file path index for the item resulting from the querying, wherein the created file path index indicates a location for the item in the hierarchy of folders.
Claim: 2. The method of claim 1, wherein the content repository does not natively support file paths.
Claim: 3. The method of claim 2, wherein the content repository comprises a relational database.
Claim: 4. The method of claim 1, wherein a folder tree includes a plurality of folders and files in the hierarchy of folders returned as a result of the querying the hierarchy of folders.
Claim: 5. The method of claim 1, wherein the querying the hierarchy of folders returns any files that qualify for a file path index, does not have a file path index, and are not more than a single level in the hierarchy of folders below a folder having a file path index.
Claim: 6. The method of claim 5, wherein a further iteration of the querying the hierarchy of folders and creating a file path index are performed, in response to the creating the file path index, to determine whether there are any further files below the files in the hierarchy of folders returned in response to the query that do not have a file path index and that qualify for a file path index.
Claim: 7. A non-transitory computer readable storage device containing instructions that, when executed on at least one programmable processor, cause the at least one programmable processor to perform operations comprising: maintaining file path indexes for file paths of items in a content repository, stored on at least one computer readable storage device, wherein the items in the content repository comprising one of indexable content and non-indexable content, wherein the items of indexable content comprise files and folders represented in a hierarchy of folders that are configured to have file path indexes to uniquely identify the files and the folders in the hierarchy of folders, and wherein the items of non-indexable content comprising files and folders not represented in the hierarchy of folders; querying, by at least one processor, the content repository for a root folder and an unfiled folder tree, wherein an unfiled folder tree has items that do not have file path indexes; creating file path indexes for items resulting from the query; querying, by the at least one processor, the hierarchy of folders for an item that qualifies for a file path index by comprising indexable content, does not have a file path index, and resides in a folder that has a file path index; and creating, by the at least one processor, a file path index for the item resulting from the querying, wherein the created file path index indicates a location for the item in the hierarchy of folders.
Claim: 8. The non-transitory computer readable storage device of claim 7, wherein the content repository does not natively support file paths.
Claim: 9. The non-transitory computer readable storage device of claim 8, wherein the content repository comprises a relational database.
Claim: 10. The non-transitory computer readable storage device of claim 7, wherein a folder tree includes a plurality of folders and files in the hierarchy of folders returned as a result of the querying.
Claim: 11. The non-transitory computer readable storage device of claim 7, wherein the querying returns any files that qualify for a file path index, does not have a file path index, and are not more than a single level in the hierarchy of folders below a folder having a file path index.
Claim: 12. The non-transitory computer readable storage device of claim 11, wherein a further iteration of the operations of querying the hierarchy of folders and creating a file path index are performed, in response to the creating the file path index, to determine whether there are any further files below the files in the hierarchy of folders returned in response to the query that do not have a file path index and that qualify for a file path index.
Claim: 13. A computing system comprising: one or more processors; and an indexer operable on the one or more processors and configured to: maintain file path indexes for file paths of items in a content repository, stored on at least one computer readable storage device, wherein the items in the content repository comprising one of indexable content and non-indexable content, wherein the items of indexable content comprise files and folders represented in a hierarchy of folders that are configured to have file path indexes to uniquely identify the files and the folders in the hierarchy of folders, and wherein the items of non-indexable content comprising files and folders not represented in the hierarchy of folders; querying, by at least one processor, the content repository for a root folder and an unfiled folder tree, wherein an unfiled folder tree has items that do not have file path indexes; creating file path indexes for items resulting from the query; query the hierarchy of folders representing for an items that qualifies for a file path index by comprising indexable content, does not have a file path index, and resides in a folder that has a file path index; and create a file path index for the item resulting from the query, wherein the created file path index indicates a location for the item in the hierarchy of folders.
Claim: 14. The computing system of claim 13, wherein the content repository does not natively support file paths.
Claim: 15. The computing system of claim 14, wherein the content repository comprises a relational database.
Claim: 16. The computing system of claim 13, wherein a folder tree includes a plurality of folders and files in the hierarchy of folders returned as a result of the querying.
Claim: 17. The computing system of claim 13, wherein the querying returns any files that qualify for a file path index, does not have a file path index, and are not more than a single level in the hierarchy of folders below a folder having a file path index.
Claim: 18. The computing system of claim 17, wherein a further iteration of the querying the hierarchy of folders and creating a file path index are performed, in response to the creating the file path index, to determine whether there are any further files below the files in the hierarchy of folders returned in response to the query that do not have a file path index and that qualify for a file path index.
Patent References Cited: 6067541 May 2000 Raju
6330567 December 2001 Chao
6427123 July 2002 Sedlar
6654734 November 2003 Mani
6920458 July 2005 Chu
7383276 June 2008 Lomet
7584460 September 2009 Broberg et al.
7660808 February 2010 Brechner et al.
7769744 August 2010 Waas et al.
7770123 August 2010 Meyer
7831591 November 2010 Masuda
7873262 January 2011 Shibata et al.
8015165 September 2011 Idicula et al.
8037054 October 2011 Brawer et al.
8126944 February 2012 McArdle
8401522 March 2013 Crawford et al.
8495619 July 2013 Tammana
8914356 December 2014 Victor
9323761 April 2016 Victor
2002/0083054 June 2002 Peltonen
2004/0024778 February 2004 Cheo
2004/0088306 May 2004 Murthy
2004/0133564 July 2004 Gross
2004/0168084 August 2004 Owen
2005/0022155 January 2005 Broberg et al.
2005/0050107 March 2005 Mane et al.
2005/0165760 July 2005 Seo
2005/0228791 October 2005 Thusoo et al.
2005/0246310 November 2005 Chang et al.
2006/0064412 March 2006 Cunningham et al.
2006/0074964 April 2006 Pallapotu
2006/0095446 May 2006 Butler
2006/0161591 July 2006 Huang et al.
2006/0167928 July 2006 Chakraborty
2006/0212457 September 2006 Pearce et al.
2007/0006217 January 2007 Tammana
2007/0073663 March 2007 McVeigh
2007/0118561 May 2007 Idicula et al.
2007/0136382 June 2007 Idicula
2007/0150434 June 2007 Takakura
2007/0156842 July 2007 Vermeulen
2007/0168327 July 2007 Lindblad et al.
2007/0168363 July 2007 Inaba et al.
2007/0203875 August 2007 Cave et al.
2007/0226235 September 2007 Fuh
2007/0233647 October 2007 Rawat
2007/0276807 November 2007 Chen et al.
2008/0046457 February 2008 Haub et al.
2008/0071805 March 2008 Mourra et al.
2008/0114803 May 2008 Chinchwadkar et al.
2008/0147614 June 2008 Tam
2008/0177701 July 2008 Merritt
2008/0195635 August 2008 Chand et al.
2008/0235252 September 2008 Sakai
2008/0256090 October 2008 Dietterich
2008/0313155 December 2008 Atchison et al.
2008/0313260 December 2008 Sweet et al.
2009/0112911 April 2009 Chu
2009/0187581 July 2009 Delisle et al.
2009/0187797 July 2009 Raynaud-Richard
2010/0010967 January 2010 Muller
2010/0100544 April 2010 Takeuchi et al.
2010/0161570 June 2010 Novak
2010/0257153 October 2010 Day et al.
2010/0257179 October 2010 Arrouye
2011/0078186 March 2011 Li et al.
2011/0119283 May 2011 Tarachandani
2011/0145216 June 2011 Subramanya
2011/0161291 June 2011 Taleck et al.
2011/0161723 June 2011 Taleck et al.
2012/0016851 January 2012 Hrle et al.
2012/0096036 April 2012 Ebaugh et al.
2012/0158689 June 2012 Doshi
2012/0166425 June 2012 Sharma
2012/0166513 June 2012 Fortune
2012/0173511 July 2012 Eto
2012/0179689 July 2012 Hornkvist
2012/0216260 August 2012 Crawford et al.
2012/0254189 October 2012 Shah
2013/0066929 March 2013 Sedlar et al.
2013/0086127 April 2013 Pogmore
2013/0103693 April 2013 Arikuma
2013/0138629 May 2013 Rehmattullah
2013/0290301 October 2013 Victor
2013/0302015 November 2013 Dini et al.
2014/0109082 April 2014 Kimmet et al.
2014/0181116 June 2014 Wang
1826692 August 2007
2008063275 May 2008
Other References: Paul Lensing et al., “hashFS: Applying Hashing to Optimize File Systems for Small File Reads,” 2010 International Workshop on Storage Network Architecture and Parallel I/Os, IEEE, pp. 33-42 (May 3, 2010). cited by applicant
ip.com et al.; “System and Method for Just-In-Time (JIT) Indexing”, IPCOM000214355D, Jan. 22, 2012 (4 pages). cited by applicant
Cabanac et al, “An Original Usage-based Metrics for Building a Unified View of Corporate Documents,” DEXA'07: Proceedings of the 18th International Conference on Database and Expert Systems Applications, vol. 4653 of LNCS, pp. 202-212, 2007. cited by applicant
U.S. Appl. No. 13/666,798, by David B. Victor, filed Nov. 1, 2012. cited by applicant
U.S. Appl. No. 13/708,684, by David B. Victor, filed Dec. 7, 2012. cited by applicant
Notice of Allowance from U.S. Appl. No. 13/708,684, dated Jan. 29, 2016, 13 pp. cited by applicant
Assistant Examiner: Le, Jessica N
Primary Examiner: Saeed, Usmaan
Attorney, Agent or Firm: Konrad Raynes Davda & Victor LLP
Victor, David W.
Accession Number: edspgr.11487707
Database: USPTO Patent Grants