Description
How to programmatically search and query content from a Plone site.
Plone uses the portal_catalog tool to perform most content-related queries. Special catalogs, such as reference_catalog, exist for specialized and optimized queries.
Plone queries are performed using the portal_catalog tool, which is available at the site root.
Example:
# portal_catalog is defined in the site root
portal_catalog = site.portal_catalog
You can also use the plone_tools view (ITools) to get access to portal_catalog if you do not have the Plone site object directly available:
from Acquisition import aq_inner
from zope.component import getMultiAdapter
context = aq_inner(self.context)
tools = getMultiAdapter((context, self.request), name=u'plone_tools')
portal_catalog = tools.catalog()
There is also a third way, relying on acquisition (attribute traversal). This is discouraged, as it adds extra processing overhead:
# Use magical Zope acquisition mechanism
portal_catalog = context.portal_catalog
...and the same in a TAL template:
<div tal:define="portal_catalog context/portal_catalog" />
Calling this persistent object is a shortcut to the query method itself, as it provides a __call__() method. Each keyword argument names an index, and the argument value is what the index should match:
# The following call does not return the actual objects,
# but brains instead
# The call takes index names and match values as keyword arguments
# Get all objects with Title "Foobar"
brains = portal_catalog(Title="Foobar")
for brain in brains:
    print("Name:" + brain["Title"] + " URL:" + brain.getURL())
Note that the accepted values depend on the queried index. Here is a path query:
# Return the direct children of myfolder (one level deep)
brains = portal_catalog(path={"query": "/myploneinstance/myfolder", "depth": 1})
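The path query dictionary can also be built from a content object instead of a hard-coded string. The following is a minimal sketch, assuming the object provides getPhysicalPath() as Zope objects do; path_query is a hypothetical helper name, not part of Plone:

```python
def path_query(obj, depth=None):
    """Build a path query dict for the location of obj.

    path_query is a hypothetical helper, not a Plone API.
    """
    # getPhysicalPath() returns a tuple of path parts,
    # e.g. ('', 'myploneinstance', 'myfolder')
    query = {"query": "/".join(obj.getPhysicalPath())}
    if depth is not None:
        # Optional limit on how far below the path to search
        query["depth"] = depth
    return query
```

It could then be used as portal_catalog(path=path_query(folder, depth=1)), where folder is any content object you already hold.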
If you call portal_catalog() without arguments it will return all indexed content objects:
# Print all content on the site
all_brains = portal_catalog()
for brain in all_brains:
    print("Name:" + brain["Title"] + " URL:" + brain.getURL())
You can use the Python slice operator to limit the number of results.
Example: getting the 10 most recently modified content items on the site:
brains = portal_catalog(sort_on="modified", sort_order="reverse")[0:10]
for brain in brains:
    print(brain["Title"] + " " + brain["ModificationDate"])
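Slicing only trims the result after the catalog has sorted everything. ZCatalog also accepts a sort_limit argument as an optimization hint; because it is only a hint, the result should still be sliced. A small sketch, where latest_modified is a hypothetical helper name and catalog is any portal_catalog-like callable:

```python
def latest_modified(catalog, count=10):
    """Return up to count brains, most recently modified first.

    latest_modified is a hypothetical helper, not a Plone API.
    """
    # sort_limit is a hint to the catalog; more results may still come back
    brains = catalog(sort_on="modified", sort_order="reverse", sort_limit=count)
    # Slice to guarantee the upper bound
    return brains[:count]
```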
portal_catalog queries return an iterable of catalog brain objects.
Brains contain a subset of the actual content object's information. The available subset is defined by the metadata columns in portal_catalog. You can see the available metadata columns on the portal_catalog “Metadata” tab in the ZMI. For more information, see indexing.
You can access the brain's metadata by column name using Python dictionary look-up:
# Get the Title metadata of a portal_catalog entry, i.e. a brain
title = brain["Title"]
The Result ID (RID) is available from the brain object, and you can use this ID to query further information about the object from the catalog.
Example:
(Pdb) brain.getRID()
872272330
To see what metadata columns a brain object contains, you can read its __record_schema__ attribute, which is a dict.
Example:
for i in brain.__record_schema__.items(): print i
('startDate', 32)
('endDate', 33)
('Title', 8)
('color', 31)
('data_record_score_', 35)
('exclude_from_nav', 13)
('Type', 9)
('id', 19)
('cmf_uid', 29)
Note
TODO: What do those numbers represent?
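Combining __record_schema__ with dictionary look-up gives a quick way to dump every metadata column of a brain, which can help when debugging. This is only a sketch; brain_to_dict is a hypothetical helper name, and it assumes every name in __record_schema__ can be looked up on the brain:

```python
def brain_to_dict(brain):
    """Copy all metadata columns of a brain into a plain dict.

    brain_to_dict is a hypothetical helper, not a Plone API.
    """
    # Iterating the schema dict yields the metadata column names
    return dict((name, brain[name]) for name in brain.__record_schema__)
```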
A portal_catalog() query returns indexed brain objects. If you want to get the actual object from which the search data was indexed, use the following:
# Load the actual object from the database (SLOW!)
# and modify it
obj = brain.getObject()
obj.setSomething("foobar")
Note
Calling getObject() has performance implications. Waking up each object requires a separate query to the database.
You cannot call getObject() on a restricted result, even in trusted code.
Instead you need to use:
obj = context.unrestrictedTraverse(brain.getPath())
For more information, see
Example:
# Return object physical path (in the database) -
# this will include Plone site id inside Zope application server
path = brain.getPath()
Since most indexes use Archetypes accessors to index the field value, the returned text is UTF-8 encoded. This is a limitation inherited from the early days of Plone.
To get a unicode value for e.g. the title, you need to do the following:
title = brain["Title"]
title = title.decode("utf-8")
if title[0] == u"å":
    # Unicode text matching etc. functions work correctly now
    pass
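Because some metadata values may already be unicode while others are UTF-8 byte strings, a defensive decoding helper avoids decoding twice. The sketch below is an illustration only; to_unicode is a hypothetical name (Plone ships a similar utility, safe_unicode, in Products.CMFPlone.utils):

```python
def to_unicode(value, encoding="utf-8"):
    """Return value as unicode text, decoding byte strings if needed.

    to_unicode is a hypothetical helper, not a Plone API.
    """
    if isinstance(value, bytes):
        # Byte string (str on Python 2): decode it
        return value.decode(encoding)
    # Already unicode, or not text at all: return unchanged
    return value
```

For example, title = to_unicode(brain["Title"]) is safe whether the column holds bytes or unicode.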
Normally you don’t get a copy of the indexed data with brains, only metadata. If you know what you are doing, you can still access the raw indexed data by using the RID of the brain object.
Example:
(Pdb) data = self.context.portal_catalog.getIndexDataForRID(872272330)
(Pdb) for i in data.items(): print i
('Title', ['ulkomuseon', 'tarinaopastukset'])
('effectiveRange', (21305115, 278752140))
('object_provides', ['Products.CMFCore.interfaces._content.IDublinCore', 'Products.ATContentTypes.interface.interfaces.IHistoryAware', 'AccessControl.interfaces.IOwned', 'OFS.interfaces.ITraversable', 'plone.portlets.interfaces.ILocalPortletAssignable', 'Products.Archetypes.interfaces._base.IBaseObject', 'zope.annotation.interfaces.IAttributeAnnotatable', 'vs.event.interfaces.IVSEvent', 'Products.CMFCore.interfaces._content.IMutableMinimalDublinCore', 'OFS.interfaces.IPropertyManager', 'OFS.interfaces.IZopeObject', 'AccessControl.interfaces.IRoleManager', 'zope.annotation.interfaces.IAnnotatable', 'Acquisition.interfaces.IAcquirer', 'Products.ATContentTypes.interface.event.IATEvent', 'OFS.interfaces.ICopySource', 'Products.LinguaPlone.interfaces.ITranslatable', 'Products.ATContentTypes.interface.interfaces.ICalendarSupport', 'Products.ATContentTypes.interface.interfaces.IATContentType', 'plone.app.iterate.interfaces.IIterateAware', 'Products.Archetypes.interfaces._base.IBaseContent', 'Products.CMFCore.interfaces._content.ICatalogableDublinCore', 'Products.CMFDynamicViewFTI.interface._base.IBrowserDefault', 'Products.Archetypes.interfaces._referenceable.IReferenceable', 'plone.locking.interfaces.ITTWLockable', 'plone.app.imaging.interfaces.IBaseObject', 'persistent.interfaces.IPersistent', 'webdav.interfaces.IDAVResource', 'AccessControl.interfaces.IPermissionMappingSupport', 'OFS.interfaces.ISimpleItem', 'plone.app.kss.interfaces.IPortalObject', 'plone.app.kss.interfaces.IContentish', 'archetypes.schemaextender.interfaces.IExtensible', 'App.interfaces.IUndoSupport', 'OFS.interfaces.IManageable', 'App.interfaces.IPersistentExtra', 'Products.CMFCore.interfaces._content.IMutableDublinCore', 'Products.Archetypes.interfaces._athistoryaware.IATHistoryAware', 'dateable.kalends.IRecurringEvent', 'OFS.interfaces.IItem', 'zope.interface.Interface', 'OFS.interfaces.IFTPAccess', 'Products.CMFDynamicViewFTI.interface._base.ISelectableBrowserDefault', 
'webdav.interfaces.IWriteLock', 'Products.CMFCore.interfaces._content.IMinimalDublinCore', 'Products.CMFCore.interfaces._content.IDynamicType', 'Products.CMFCore.interfaces._content.IContentish'])
('Type', u'VSEvent')
('id', 'ulkomuseon-tarinaopastukset')
('cmf_uid', 2)
('recurrence_days', [733960, 733981, 733974, 733967])
('end', 1077028380)
('Description', ['saamelaismuseon', 'ulkomuseossa', ...
('is_folderish', False)
('getId', 'ulkomuseon-tarinaopastukset')
('start', 1077028380)
('is_default_page', False)
('Date', 1077036795)
('review_state', 'published')
('Language', <LanguageIndex.IndexEntry id 872272330 language fi, cid 8b9a08c216b8e086f3446775ad71a748>)
('portal_type', 'VSEvent')
('expires', 1339244460)
('allowedRolesAndUsers', ['Anonymous'])
('getObjPositionInParent', 10)
('path', '/siida/sisalto/8-vuodenaikaa/ulkomuseon-tarinaopastukset')
('in_reply_to', '')
('UID', '8b9a08c216b8e086f3446775ad71a748')
('Creator', 'admin')
('effective', 1077036795)
('getRawRelatedItems', [])
('getEventType', [])
('created', 1077036792)
('modified', 1077048720)
('SearchableText', ['ulkomuseon', 'tarinaopastukset', ...
('sortable_title', 'ulkomuseon tarinaopastukset')
('meta_type', 'VSEvent')
('Subject', [])
You can also directly access a single index:
# Get event brain result id
rid = event.getRID()
# Get the list of indexed recurrence_days values.
# ZCatalog holds an internal Catalog object which we can poke directly in an evil way.
# This call goes to the Products.PluginIndexes.UnIndex.Unindex class, and we
# read from there the persistent value it has stored in our
# recurrence_days index
indexed_days = portal_catalog._catalog.getIndex("recurrence_days").getEntryForObject(rid, default=[])
The following is useful in unit test debugging:
# Print all objects visible to the currently logged in user
for i in portal_catalog(): print(i.getURL())
Note
Security: All portal_catalog queries are limited to the current user's permissions by default.
If you want to bypass this restriction, use the unrestrictedSearchResults() method.
Example:
# Print all content in portal_catalog, ignoring permissions
for i in portal_catalog.unrestrictedSearchResults(): print(i.getURL())
Note
All portal_catalog() queries are limited to the selected language of the current user. You need to explicitly bypass the language check if you want to do multilingual queries.
Example of how to bypass the language check:
all_brains = portal_catalog(Language="all")
Plone and portal_catalog have a mechanism to list only active (non-expired) content by default.
Below is an example of how the expired content check is made:
mtool = context.portal_membership
show_inactive = mtool.checkPermission('Access inactive portal content', context)
contents = context.portal_catalog.queryCatalog(show_inactive=show_inactive)
See also:
* Listing
Warning
Usually, if you pass in None as the query value, it will match all objects instead of zero objects.
Note
TODO: How to query None values?
ExtendedPathIndex is the index used for content object paths. The path index stores the physical path of each object.
Warning: If you ever rename your Plone site instance, the path index needs to be rebuilt.
Example:
portal_catalog(path={ "query": "/myploneinstance/myfolder" }) # return myfolder and all child content
The KeywordIndex index type indexes a list of values. It is used e.g. by Plone’s categories (subject) feature and the object_provides index of provided interfaces.
You can either query
The index of the catalog to query is either the name of the keyword argument, a key in a mapping, or an attribute of a record object.
Attributes of record objects
Below is an example of matching any of multiple values given as a Python list in a KeywordIndex. It queries all event types, and the recurrence_days KeywordIndex must match any of the given dates:
# Query all events on the site
# Note that there is no separate list for recurrent events
# so if you want to speed up you can hardcode
# recurrent event type list here.
matched_recurrence_events = self.context.portal_catalog(
    portal_type=supported_event_types,
    recurrence_days={
        "query": recurrence_days_in_this_month,
        "operator": "or"
    })
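The query dictionary used above has only two moving parts: the list of values and the operator. A small sketch of a builder for it follows; keyword_query is a hypothetical helper name, and it assumes the index accepts "and" as well as the "or" operator shown above:

```python
def keyword_query(values, match_all=False):
    """Build a KeywordIndex query dict.

    keyword_query is a hypothetical helper, not a Plone API.
    match_all=False matches objects having any of the values ("or");
    match_all=True requires all of the values ("and").
    """
    operator = "and" if match_all else "or"
    return {"query": list(values), "operator": operator}
```

For example, portal_catalog(Subject=keyword_query(["news", "sports"])) would match content tagged with either category.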
To get all catalog brains of a certain content type on the whole site:
campaign_brains = self.context.portal_catalog(portal_type="News Item")
To see the available type names, visit the portal_types tool in the ZMI.
By default, the portal_catalog query does not care about the workflow state. You might want to limit the query to published items.
Example:
campaign_brains = self.context.portal_catalog(portal_type="News Item", review_state="published")
review_state is a portal_catalog index which reads the portal_workflow variable “review_state”. For more information, see the Contents tab of the portal_workflow tool in the ZMI.
The following view snippet allows you to get one random item on the site:
import random
def getRandomCampaign(self):
    """Return a random published CampaignPage, or the current item if none is left."""
    campaign_brains = self.context.portal_catalog(portal_type="CampaignPage", review_state="published")
    # Filter out the items we do not want to show
    bad_ids = ["you", "might", "want to black list some ids here"]
    items = [brain for brain in campaign_brains if brain["getId"] not in bad_ids]
    # Check that we have items left after filtering
    if items:
        # Pick one
        chosen = random.choice(items)
        return chosen.getObject()
    else:
        # Fallback to the current content item if no random options available
        return self.context
The following examples demonstrate how to do range-based queries. This is useful if you want to find the “minimum” or “maximum” values of something. The examples assume that there is an index called ‘getPrice’.
Get items whose value is greater than or equal to 2:
items = portal_catalog({'getPrice':{'query':2,'range':'min'}})
Get items whose value is less than or equal to 40:
items = portal_catalog({'getPrice':{'query':40,'range':'max'}})
Get items whose value falls between 2 and 1000:
items = portal_catalog({'getPrice':{'query':[2,1000],'range':'min:max'}})
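The three forms above differ only in the shape of the query dictionary, so they can be produced by one function. This is a sketch under that assumption; range_query is a hypothetical helper name, not part of ZCatalog:

```python
def range_query(minimum=None, maximum=None):
    """Build a range query dict for a catalog index.

    range_query is a hypothetical helper, not a ZCatalog API.
    """
    if minimum is not None and maximum is not None:
        return {"query": [minimum, maximum], "range": "min:max"}
    if minimum is not None:
        return {"query": minimum, "range": "min"}
    if maximum is not None:
        return {"query": maximum, "range": "max"}
    raise ValueError("give at least one of minimum or maximum")
```

For example, portal_catalog({'getPrice': range_query(2, 1000)}) is equivalent to the last query above.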
See DateIndex.
Example:
from DateTime import DateTime
# "effective" is the catalog index holding the effective (publishing) date
items = portal_catalog(effective={'query': (DateTime('2002-05-08 15:16:17'),
                                            DateTime('2062-05-08 15:16:17')),
                                  'range': 'min:max'})
Another example, showing how to get news items for a particular year in template code:
<div metal:fill-slot="main" id="content-news"
tal:define="boundLanguages here/portal_languages/getLanguageBindings;
prefLang python:boundLanguages[0];
DateTime python:modules['DateTime'].DateTime;
start_year request/year| python: 2004;
end_year request/year| python: 2099;
start_year python: int(start_year);
end_year python: int(end_year);
results python:container.portal_catalog(
portal_type='News Item',
sort_on='Date',
sort_order='reverse',
review_state='published',
id=prefLang,
created={ 'query' : [DateTime(start_year,1,1), DateTime(end_year,12,31)], 'range':'minmax'}
);
results python:[r for r in results if r.getObject()];
Batch python:modules['Products.CMFPlone'].Batch;
b_start python:request.get('b_start',0);
portal_discussion nocall:here/portal_discussion;
isDiscussionAllowedFor nocall:portal_discussion/isDiscussionAllowedFor;
getDiscussionFor nocall:portal_discussion/getDiscussionFor;
home_url python: mtool.getHomeUrl;
localized_time python: modules['Products.CMFPlone.PloneUtilities'].localized_time;">
...
</div>
You can query by language:
portal_catalog({"Language":"en"})
Note
Products.LinguaPlone must be installed.
See AdvancedQuery.
Example:
from Products import AdvancedQuery
portal_catalog = self.portal_catalog # Acquire portal_catalog from higher hierarchy level
path = self.getPhysicalPath() # Limit the search to the current folder and its children
# object.getPhysicalPath() returns the path as tuples of path parts
# Convert path to string
path = "/".join(path)
# Limit the search to the path of the current context object and
# match all children having either of two index values
# AdvancedQuery operations can be combined using the Python operators & | and ~
# or AdvancedQuery objects
query = AdvancedQuery.Eq("path", path) & (AdvancedQuery.Eq("getMyIndexGetter1", "foo") | AdvancedQuery.Eq("getMyIndexGetter2", "bar"))
# The following result variable contains an iterable of CatalogBrain objects
results = portal_catalog.evalAdvancedQuery(query)
# Convert the catalog brains to a Python list containing tuples of object unique ID and Title
pairs = []
for nc in results:
    pairs.append((nc["UID"], nc["Title"]))
# Second example, from inside a method: combine a path match with a full-text match.
# diagnose_path and text_query_target are defined elsewhere.
from Products.AdvancedQuery import Eq
query = Eq("path", diagnose_path) & Eq("SearchableText", text_query_target)
return self.context.portal_catalog.evalAdvancedQuery(query)
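When the number of alternatives is not known in advance, the | combination can be applied over a list instead of being written out by hand. The sketch below relies only on the | operator described in the comments above; match_any is a hypothetical helper name and works with any objects that support |:

```python
from functools import reduce
import operator


def match_any(subqueries):
    """Combine a non-empty sequence of query objects with the | operator.

    match_any is a hypothetical helper, not an AdvancedQuery API.
    """
    return reduce(operator.or_, subqueries)
```

It might be used as Eq("path", path) & match_any([Eq("portal_type", t) for t in type_names]), where type_names is a list you supply.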
A portal_catalog query takes a sort_on argument, which names the index used for sorting. sort_order defines the sort direction; it can be the string “reverse”.
Sorting is supported only on FieldIndexes. Due to the nature of searchable text indexes (they index split text, not strings), they cannot be used for sorting. For example, to sort by title, an index called sortable_title should be used.
Example of how to sort by id:
results = context.portal_catalog.searchResults(sort_on="id",
                                               portal_type="Document",
                                               sort_order="reverse")
ZCatalog has a uniqueValuesFor() method to retrieve all unique values for a certain index. It is intended to work on FieldIndexes only.
Example:
# getArea() is the Archetypes accessor for the area field,
# which is a string and tells the content area.
# A custom getArea FieldIndex indexes these values
# to the portal catalog.
# The following line gives all area values
# entered on the site.
areas = portal_catalog.uniqueValuesFor("getArea")
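The unique values can be combined with ordinary queries, for example to count how many items carry each value. A minimal sketch, where count_by_value is a hypothetical helper name and catalog is assumed to behave like portal_catalog:

```python
def count_by_value(catalog, index_name):
    """Map each unique value of an index to the number of matching brains.

    count_by_value is a hypothetical helper, not a ZCatalog API.
    """
    counts = {}
    for value in catalog.uniqueValuesFor(index_name):
        # One catalog query per unique value
        counts[value] = len(catalog(**{index_name: value}))
    return counts
```

Since ordinary queries are limited to the current user's permissions, a value can show a count of zero for users who cannot see the matching content.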