Content Retriever query parameters and default values

I've been struggling with a query for a bit of time, and I'm trying to determine if I'm running into bugs in the code or just in the way I think.

We added a reusable schema that helps with SEO features. One of the fields (SEOIncludeInSearchResults) is a boolean that is to be checked off to allow a page to be searchable in the Lucene index and to crawlers in general. On the field definition, we checked off the box to make the default value of that field to be true.

In the site search crawler, I'm trying to retrieve one of these pages like so (id is a GUID parameter):


var result = await contentRetriever.RetrievePagesByGuids<T>(
    [id],
    new RetrievePagesParameters
    {
        ChannelName = channelName,
        LanguageName = languageName,
        IsForPreview = false,
        LinkedItemsMaxLevel = 4,
    },
    query => query
        .Where(where => where.WhereTrue(nameof(ISEO.SEOIncludeInSearchResults))
            .Or()
            .WhereNull(nameof(ISEO.SEOIncludeInSearchResults)))
        .TopN(1),
    RetrievalCacheSettings.CacheDisabled);

I would expect that the additional criteria I pass to that call would be applied to the pages with matching GUIDs. That is, it has to be in the list of GUIDs, and SEOIncludeInSearchResults needs to be true or NULL.

The problem is that the SQL this generates has a WHERE clause that looks like this:

WHERE
  [WebPageItemGUID] = '<Some GUID>' AND
  [SEOIncludeInSearchResults] = @SEOIncludeInSearchResults OR
  [SEOIncludeInSearchResults] IS NULL

There's no grouping in that WHERE clause. As a result, the page I get back isn't the one I asked for. Is that the expected behavior?

Beyond that, remember that I checked off the "default this value to true" option when I created the field. I recognize that XByK doesn't explicitly set a value in the database when a field is added, and that means the backing field will be NULL. But to my mind, NULL should be interpreted as "true" in this case because I asked for that as the default. I feel like that was the behavior in K13 when fields were added to page types.

Am I just thinking wrong here?

Somewhat related, I've been troubleshooting a slow query that filters by tags. When I ran the SQL Profile on that, this is the WHERE that appears to be generated when you use WhereContainsTags:

EXISTS (
  SELECT 1
  FROM OPENJSON([NewsAndMediaCategory]) WITH (identifier UNIQUEIDENTIFIER '$.Identifier')
  WHERE identifier IN (
    SELECT [Value]
    FROM @tagIdentifiers
))

That NewsAndMediaCategory field is a taxonomy, and the underlying implementation is an nvarchar(max). There's no way to use an index there, so the entire table gets scanned. In this case, the field is in a reusable schema, so it's the very large CMS_ContentItemCommonData table that's getting scanned! Why wouldn't this is the CMS_ContentItemTag table to filter instead? That seems to be indexed in a way that would help.

Environment

Tags:
Content querying Content types SQL

Answers

To response this discussion, you have to login first.