Querying Document Sets Using SPSiteDataQuery
In SharePoint 2010, a document set is exactly what the name shouts at us: a bundle of documents! Agnes (SharePoint MVP) has a nice, short post about how to set up document sets. If you activate a site collection scoped feature called “Document ID Service”, document sets, like any other document in a site collection, are assigned a unique ID (e.g. “DHSD-5-14”) which can be used to retrieve them independent of their location in the site collection. An awesome feature of SP2010!
A pretty common integration scenario is when a BizTalk orchestration (or CRM workflow) needs to pass a document set ID to your WCF service (installed in the ISAPI folder on the WFE) and your service needs to return the document library URL based on that document set ID. This way BizTalk can upload multiple documents to that document set. Let’s assume that the document library and the document set are already created and BizTalk is aware of them.
How would you go about this?
I haven’t found an easy way to do this through the SharePoint object model. Obviously, one inefficient way to do this is to loop through all the document libraries and find the document set, like below:
[CSharp]
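using (SPSite site = new SPSite("http://foo"))
{
    using (SPWeb web = site.OpenWeb())
    {
        // Bad code: walks every item of every list in the web.
        string url = null;
        foreach (SPList list in web.Lists)
        {
            if (list.ContentTypesEnabled)
            {
                // Skip lists that do not use the Document Set content type.
                SPContentType spCtype = list.ContentTypes["Document Set"];
                if (spCtype == null) continue;
                foreach (SPListItem item in list.Items)
                {
                    // Items without a Document ID return null here.
                    object docId = item["Document ID"];
                    if (docId != null && docId.ToString().Contains("DHSD-5-14"))
                    {
                        // RootFolder.Url is used instead of Title, since a renamed list keeps its old URL.
                        url = string.Format("{0}/{1}", list.ParentWeb.Url, list.RootFolder.Url);
                    }
                }
            }
        }
    } // End of SPWeb using
} // End of SPSite using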
[/CSharp]
As you can see, when you activate the Document ID Service, each document set (they are SPListItems at the end of the day, right?) gets a new column called “Document ID”. By examining SPListItem[“Document ID”] on each item, I can find the one I am looking for.
Another (and better) way is to use the SPSiteDataQuery to query the entire SPWeb (and all the child subsites) for that document set.
[CSharp]
using (SPSite site = new SPSite("http://foo"))
{
    using (SPWeb web = site.OpenWeb())
    {
        SPSiteDataQuery query = new SPSiteDataQuery();
        query.Webs = “
        query.Lists = “
        query.Query = “
        query.Query += “
        query.Query += “
        query.Query += “
        DataTable dt = web.GetSiteData(query);
        if (dt.Rows.Count == 0)
        {
            // No donuts!
        }
        else
        {
            DataRow dr = dt.Rows[0]; // There is always one row, if any!
            Guid listId = new Guid(Convert.ToString(dr["ListID"]));
            // Assumes the library lives in this web; a match in a subsite has to be opened through its WebID first.
            SPList targetDocLib = web.Lists[listId];
            // RootFolder.Url is used instead of Title, since a renamed list keeps its old URL.
            string url = string.Format("{0}/{1}", targetDocLib.ParentWeb.Url, targetDocLib.RootFolder.Url);
        }
    } // End of SPWeb using
} // End of SPSite using
[/CSharp]
If you have worked with SPSiteDataQuery before, most likely you are aware that it’s a picky, unforgiving API. A simple casing or spelling mistake either results in errors or (even worse) returns no results. In my case I had issues querying document sets.
As you can tell, I am not passing any ViewFields to the SPSiteDataQuery. These are not required as ListID, as well as WebID and SPListItem ID, are returned by default. You will also notice that I haven’t looped through all of the returned data rows, since if my query returns any result, there should be only one document set with the id of “DHSD-5-14”.
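To make the point about the default columns concrete, here is a minimal sketch of reading them from the result table. `SiteDataRow` and `TryGetIds` are hypothetical names of my own, not part of the SharePoint API, and the sketch assumes the table carries the default `WebId`, `ListId` and `ID` columns as strings. It only depends on System.Data, so it can be tried outside a SharePoint server.

```csharp
using System;
using System.Data;

// Hypothetical helper (not part of the SharePoint API): pulls the default
// identifier columns out of the DataTable returned by SPWeb.GetSiteData.
public static class SiteDataRow
{
    // Returns false when the query matched nothing; otherwise fills the IDs
    // from the first row (a Document ID should match at most one item).
    public static bool TryGetIds(DataTable results, out Guid webId, out Guid listId, out int itemId)
    {
        webId = Guid.Empty;
        listId = Guid.Empty;
        itemId = 0;

        if (results == null || results.Rows.Count == 0)
            return false;

        DataRow row = results.Rows[0];

        // Column lookups on a DataTable are case-insensitive by default,
        // so "ListID" and "ListId" resolve to the same column.
        webId = new Guid(Convert.ToString(row["WebId"]));
        listId = new Guid(Convert.ToString(row["ListId"]));
        itemId = Convert.ToInt32(row["ID"]);
        return true;
    }
}
```

With both IDs in hand, something like `site.OpenWeb(webId)` followed by `Lists[listId]` should reach the library even when the document set sits in a subsite rather than in the web the query started from.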
The most important thing, though, is the use of _dlc_DocId, which is referenced in the where clause of my CAML query. This is the static name for the Document ID column, and it took me a little while to figure it out. Thankfully, using the SPSiteDataQuery class, I was able to quickly find my document set and extract the URL of its parent document library.
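For reference, here is a rough sketch of what CAML for such a query could look like. Everything in it is an assumption on my part rather than the exact markup used above: a recursive web scope, a list filter on document libraries (server template 101), an `Eq` comparison, and a `Text` value type for `_dlc_DocId`. `DocIdCaml` is a hypothetical helper, not a SharePoint class. It builds the where clause with System.Xml.Linq so that casing stays consistent and the value is escaped.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical helper (not part of the SharePoint API): builds the CAML
// fragments that SPSiteDataQuery takes as plain strings.
public static class DocIdCaml
{
    // Assumption: search the current web and all of its subsites.
    public const string Webs = "<Webs Scope='Recursive' />";

    // Assumption: only look at document libraries (server template 101).
    public const string Lists = "<Lists ServerTemplate='101' />";

    // Builds a <Where> clause that matches the Document ID column by its
    // static name, _dlc_DocId. CAML element and attribute names are case-sensitive.
    public static string BuildWhere(string documentId)
    {
        if (string.IsNullOrEmpty(documentId))
            throw new ArgumentException("A document ID is required.", "documentId");

        XElement where =
            new XElement("Where",
                new XElement("Eq",
                    new XElement("FieldRef", new XAttribute("Name", "_dlc_DocId")),
                    new XElement("Value", new XAttribute("Type", "Text"), documentId)));

        // XElement escapes special characters in the value (for example & and <).
        return where.ToString(SaveOptions.DisableFormatting);
    }
}
```

The three strings would then be assigned as `query.Webs = DocIdCaml.Webs;`, `query.Lists = DocIdCaml.Lists;` and `query.Query = DocIdCaml.BuildWhere("DHSD-5-14");` before calling `web.GetSiteData(query)`.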