API spec review: UserActivityHistory #5260
base: feature/UserActivityHistoryAPI
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,266 @@ | ||
UserActivityHistory | ||
=== | ||
|
||
# Background | ||
|
||
The [UserActivity](https://learn.microsoft.com/uwp/api/windows.applicationmodel.useractivities.useractivity) | ||
Review comment: We need to update the docs on UserActivity as well to drop references to Timeline (yes, still there) and instead reference AI and stuff. |
||
class can be used to record the activities that the user is currently | ||
engaged in on their computer - e.g., browsing a website, reading a Word document, etc. | ||
This allows Windows to have insight into the application state, enabling smart experiences that are | ||
built around the semantics of the app. For example, a document editor can give Windows information | ||
about the document that the user is editing, so that Recall can later take the user to the document | ||
at the same location. | ||
|
||
To record user activity, an app uses [UserActivityChannel](https://learn.microsoft.com/en-us/uwp/api/windows.applicationmodel.useractivities.useractivitychannel?view=winrt-26100) | ||
Review comment: Why is this information here, rather than in the docs for how to use UserActivity? The person using this API wants to query existing ones, not necessarily create new ones. |
||
to retrieve a UserActivity object via the API [GetOrCreateUserActivityAsync](https://learn.microsoft.com/en-us/uwp/api/windows.applicationmodel.useractivities.useractivitychannel.getorcreateuseractivityasync?view=winrt-26100#windows-applicationmodel-useractivities-useractivitychannel-getorcreateuseractivityasync(system-string)). | ||
If a UserActivity with the given ID already exists, it will be returned; otherwise, a new UserActivity | ||
object will be created and returned. You can then call the API [CreateSession](https://learn.microsoft.com/en-us/uwp/api/windows.applicationmodel.useractivities.useractivity.createsession?view=winrt-26100#windows-applicationmodel-useractivities-useractivity-createsession) | ||
to return a [UserActivitySession](https://learn.microsoft.com/en-us/uwp/api/windows.applicationmodel.useractivities.useractivitysession?view=winrt-26100) | ||
object that tracks how long the user is engaged in that activity. This structure allows multiple | ||
sessions to be associated with the same activity, representing the case where the user completes | ||
that activity a bit at a time - e.g., beginning to watch a movie, then pausing, then watching more later. | ||
These will be treated as the same singular user activity that spans multiple sessions. | ||
|
||
UserActivityHistory is a new set of APIs that allow you to query up to the past 28 days of the | ||
user's activity history, which will enable you to bring back content that the user has previously | ||
been interacting with. | ||
|
||
# Conceptual pages (How To) | ||
|
||
The intended use case of this API is to allow you to make search queries against the user's activity history | ||
on their local computer. The string matching in the query parameters in this API is lexical in nature, | ||
Review comment: Very wordy. Suggest more like "This API uses simple string matching for searches; it does not support natural language. If you want to support natural language search scenarios, use <some other WCR API?>." |
||
meaning that it is expected that any natural language semantic parsing of the user's input will be done by | ||
your app prior to calling this API. | ||
|
||
For example, if a user types in something along the lines of, "Please find the Korean recipe I was looking at | ||
earlier today", your app might have an agentic AI parse that input and determine that the user is looking | ||
Review comment: Is "agentic" meaningful here? |
||
for a webpage that contains the keywords "Korean" and "recipe", and construct a query with those keywords, | ||
a content type of "text/html", and an access time within the last 24 hours. | ||
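As a rough sketch of the flow described above, the following C# shows how an app might turn already-parsed user intent into query criteria. The `ParsedIntent` and `ActivityQuery` types and the `BuildQuery` helper are hypothetical names invented for illustration; `ActivityQuery` only mirrors the shape of the proposed `UserActivityHistoryQuery` and is not the real type.

```c#
using System;

// Hypothetical stand-in mirroring the shape of the proposed UserActivityHistoryQuery.
public sealed class ActivityQuery
{
    public string[] Keywords { get; set; } = Array.Empty<string>();
    public string ContentType { get; set; } = "";
    public DateTime? EarliestStartTime { get; set; }
}

// Hypothetical result of the app's own natural language parsing step.
public sealed record ParsedIntent(string[] Keywords, string ContentType, TimeSpan LookBack);

public static class QueryBuilder
{
    // Translates parsed intent into purely lexical query criteria.
    public static ActivityQuery BuildQuery(ParsedIntent intent, DateTime now) => new()
    {
        Keywords = intent.Keywords,
        ContentType = intent.ContentType,
        // "Within the last 24 hours" means the activity started no earlier than now - 24h.
        EarliestStartTime = now - intent.LookBack,
    };
}
```

For the recipe request, the parsing step would presumably produce the keywords "Korean" and "recipe", the content type "text/html", and a look-back of 24 hours.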
|
||
In order for an app to make use of this API, it must be Windows logo certified, and the user must provide | ||
Review comment: link to "how do I get the Windows logo certification" page?
Review comment: Also, need to take into account whatever consent model we end up with - one time, every time, does it include context, etc? |
||
their consent to allow access to their activity history. If either of these is not the case, the API will | ||
throw an exception. | ||
Review comment: The API will fail - the exception is specific to the language projection. |
||
|
||
# API Pages | ||
|
||
## UserActivityHistory class | ||
|
||
This class provides static methods that enable you to query the user's activity history. | ||
This activity history is stored in a database managed by a local service, and these APIs | ||
Review comment: Is there a reason why we talk about implementation? If the implementation changes, will we break something? |
||
call out to that service to retrieve data from the database. | ||
|
||
Here is an example usage of the class that will enable you to bring back the webpage for a | ||
Korean recipe that the user had previously interacted with within the last day: | ||
|
||
```c# | ||
UserActivityHistoryQuery query = new(); | ||
Review comment: Since this is the first time we're seeing code, can we show the API to request access, too? Are you just relying on the AppCapability class? Although I think the UX for consent is more dynamic, so it is probably part of the API call itself. Big open question. |
||
query.Keywords = new string[] { "Korean", "recipe" }; | ||
|
||
query.EarliestStartTime = DateTime.Now.AddDays(-1); | ||
|
||
IList<UserActivityHistoryItem> results = UserActivityHistory.Search( | ||
new UserActivityHistoryQuery[] { query }, | ||
Review comment: Should this be an overload? Seems strange to have to create an array for a single search item.
Review comment: RECOMMEND: Add an overload for 1 query |
||
UserActivityHistoryOrderBy.DwellTime, | ||
maxResults: 1); | ||
Review comment: Why name the parameter? |
||
|
||
UserActivityHistoryItem item = results.FirstOrDefault(); | ||
|
||
Review comment: RECOMMEND: Remove blank line |
||
if (item != null) | ||
{ | ||
// Now we can use item.ActivationUri to bring back the webpage in the state in which the user | ||
// was last viewing it. | ||
} | ||
``` | ||
|
||
## UserActivityHistory.Search method | ||
|
||
This method synchronously queries the user's activity history and returns a list of items matching | ||
Review comment: Do we need to mention "synchronously" everywhere? It should be assumed that unless the API ends with "Async" that it is synchronous |
||
the criteria specified in the `queries` parameter. The results are ordered in descending order | ||
according to the `orderBy` parameter: either by the most recent start times, the most recent | ||
end times, or the longest time spent on the activity. | ||
|
||
The results will include the user activity history items that match any one of the queries passed in. | ||
To match a given query, the item must match all of the criteria specified in that query. Any query | ||
property that is left empty or null will be ignored. A case-insensitive lexical search will be | ||
performed on the keywords in the query, which will match if all of the keywords are found somewhere | ||
in the DisplayText property of the item. | ||
|
||
This method will not perform any parsing of the keywords for semantic meaning or natural language - | ||
it is expected that the app will have already performed that step and will pass the result of that | ||
into this API. | ||
|
||
A case-insensitive lexical search will also be performed on the ContentType property of the item, | ||
which is the [MIME type](https://docs.w3cub.com/http/basics_of_http/mime_types/complete_list_of_mime_types.html) | ||
of the resource the user was interacting with. The search will match if the ContentType of the item matches | ||
the ContentType property of the query. The ContentType supports using an asterisk as a wildcard | ||
to match a range of content types - e.g., "image/*" will match "image/png", "image/jpeg", etc. | ||
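The matching rules described in this section can be modelled roughly as follows. This is a minimal sketch, not the service's implementation: `HistoryItem`, `HistoryQuery`, and `MatchSketch` are hypothetical stand-ins, the use of ordinal case-insensitive comparison is an assumption (the spec does not state which comparison is used), and only a trailing asterisk is handled as a wildcard. Time-range criteria are left out for brevity.

```c#
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the proposed item and query types.
public sealed record HistoryItem(string DisplayText, string ContentType);
public sealed record HistoryQuery(string[] Keywords, string ContentType);

public static class MatchSketch
{
    // Every keyword must appear somewhere in DisplayText, ignoring case.
    // A null or empty keyword array is ignored.
    static bool KeywordsMatch(HistoryItem item, HistoryQuery query) =>
        query.Keywords == null ||
        query.Keywords.All(k => item.DisplayText.Contains(k, StringComparison.OrdinalIgnoreCase));

    // A null or empty ContentType is ignored; a trailing "*" matches any suffix.
    static bool ContentTypeMatches(HistoryItem item, HistoryQuery query)
    {
        if (string.IsNullOrEmpty(query.ContentType)) return true;
        if (query.ContentType.EndsWith("*", StringComparison.Ordinal))
        {
            string prefix = query.ContentType.Substring(0, query.ContentType.Length - 1);
            return item.ContentType.StartsWith(prefix, StringComparison.OrdinalIgnoreCase);
        }
        return string.Equals(item.ContentType, query.ContentType, StringComparison.OrdinalIgnoreCase);
    }

    // An item is a result if it satisfies every criterion of at least one query.
    public static bool IsMatch(HistoryItem item, IEnumerable<HistoryQuery> queries) =>
        queries.Any(q => KeywordsMatch(item, q) && ContentTypeMatches(item, q));
}
```

Under these assumptions, an item whose DisplayText is "Easy Korean BBQ Recipe" and whose ContentType is "text/html" matches a query with keywords "korean" and "recipe" and content type "text/*", but not a query that also requires the keyword "pasta".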
|
||
## UserActivityHistory.GetAppsWithUserActivity method | ||
|
||
This method synchronously retrieves a list of all the app names with data in the user's activity | ||
history database. You can use this, for example, to show the user the list of apps that are being | ||
queried against, so the user can understand why an app that is not recording user activity is not | ||
Review comment: Typo? Should be "...what an app |
||
showing up in the results. | ||
|
||
|
||
## UserActivityHistoryItem class | ||
|
||
This class represents a single item in the user's activity history. It contains properties that | ||
describe in what app the activity occurred, how the app described the activity, what sort of | ||
resource was being interacted with (e.g., a document, a webpage, a video, etc.), the URI of | ||
the resource involved in the activity, the URI that can be used to bring back the state the user | ||
left the activity in, and the times when the user started and ended the activity. | ||
|
||
If the user performed the same activity multiple times, there will be multiple | ||
`UserActivityHistoryItem` objects returned, each with different start and end times. | ||
|
||
## UserActivityHistoryItem.AppName property | ||
|
||
This property contains the name of the app in which the activity occurred. | ||
Review comment: what kind of name is this? a PFN? an AUMID? display name? exe path?
Review comment: My current implementation contains the exe path, but it would be even better if there were a way to get the display name. I can't immediately find one. We have to infer this from the caller; the UserActivity object does not have this property anywhere.
Review comment: we can use CallerIdentity or similar (e.g. CoGetCallContext) to get an AUMID. When an app receives this AUMID they can find the display name of that AUMID for display purposes. We could choose to store the display name too, because the app might get uninstalled sometime after capturing the user activity and before querying it (what happens with that app's user activity history, does it get deleted?) In any case let's take an action item to update the wording here once we have a solid caller id implementation
Review comment: Wrinkle: conversion from PFN to Display Name should happen in which context? Ideally, it is in the CUA's context so it is localized to match the CUA. But Start Menu might show a different localization so you might not be able to find it. I don't know what the right answer is.
Review comment: Is this an unpackaged app or packaged? If packaged you'll want to record the package full name. Given that you can lookup its DisplayName (and Logo) localized for the current user to view. Is this historical? Does UserActivityHistory retain information recorded by apps after they're uninstalled? If so then you can't guarantee looking up the DisplayName. If so there are options but they have caveats so I'll wait to hear if relevant before saying more.
returns the package's DisplayName localized for the calling user. Are there cases where the package isn't registered for the calling user? |
||
|
||
## UserActivityHistoryItem.ActivityId property | ||
|
||
This property contains the ID of the activity, which can be used to collate multiple sessions | ||
of the same activity. For example, if the user watched a movie in multiple sessions, the | ||
Review comment: Is watching a video an actual scenario supported by any apps we know that report Activity History? Is it the most interesting one? I would expect a more obvious one would be opening the same Word document 5 times in a week, and them all being related somehow. Or visiting the same website (like your e-mail) every day. And so on. |
||
`ActivityId` property can be used to identify how long in total the user spent watching that movie. | ||
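A minimal sketch of that collation, assuming the caller already holds a list of result items: `SessionItem` and `DwellSketch` are hypothetical stand-ins that carry only the `ActivityId`, `StartTime`, and `EndTime` properties needed here.

```c#
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in carrying only the properties this sketch needs.
public sealed record SessionItem(string ActivityId, DateTime StartTime, DateTime EndTime);

public static class DwellSketch
{
    // Sums (EndTime - StartTime) across all items that share an ActivityId.
    public static Dictionary<string, TimeSpan> TotalTimeByActivity(IEnumerable<SessionItem> items) =>
        items.GroupBy(i => i.ActivityId)
             .ToDictionary(
                 g => g.Key,
                 g => TimeSpan.FromTicks(g.Sum(i => (i.EndTime - i.StartTime).Ticks)));
}
```

Two sessions of 30 and 45 minutes recorded under the same `ActivityId` would therefore be reported as a single entry of 75 minutes.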
|
||
## UserActivityHistoryItem.DisplayText property | ||
|
||
This property contains the app's own description of the activity. For example, | ||
if the activity was reading the contents of a webpage, this property might contain the webpage's title. | ||
|
||
## UserActivityHistoryItem.ContentType property | ||
|
||
This property contains the MIME type of the content being interacted with. For example, if the user | ||
was looking at a PNG image, this property would contain the string "image/png". | ||
|
||
|
||
Note that this is a string property, not an enum, so apps that populate this property | ||
do not necessarily have to use existing recognized common MIME types. | ||
|
||
## UserActivityHistoryItem.ContentUri property | ||
|
||
This property contains the URI of the content being interacted with. For example, if the user was | ||
looking at a webpage, this property would contain the URI of that webpage. | ||
|
||
|
||
## UserActivityHistoryItem.ActivationUri property | ||
|
||
This property contains the URI that can be used to bring back the state the user left the activity in. | ||
For example, if the user was looking at a webpage, this property would contain the URI of that webpage | ||
with additional information, such as what the scroll position was, etc. | ||
|
||
If this property is not populated on the UserActivity object, then we won't add it to the user activity | ||
Review comment: Can the logic be that if any of the properties [ Then ideally for |
||
history database, as we won't be able to bring back that activity. | ||
|
||
## UserActivityHistoryItem.StartTime property | ||
|
||
This property contains the time when the user started the activity session. | ||
|
||
## UserActivityHistoryItem.EndTime property | ||
|
||
This property contains the time when the user ended the activity session. | ||
|
||
|
||
## UserActivityHistoryQuery class | ||
|
||
This class is used to specify criteria for what portion of the user's activity history you want to | ||
retrieve. It allows you to specify keywords to search for, content types to filter by, and time ranges | ||
to filter by. | ||
|
||
## UserActivityHistoryQuery.Keywords property | ||
|
||
This property is an array of keywords, each of which is used to lexically search against the | ||
|
||
DisplayText column in the database. Keywords are case-insensitive, and results returned will | ||
Review comment: Is it an ordinal search, or a search based on a specific locale? If it's locale-sensitive, hopefully it uses the locale of the caller. |
||
be those that contain all of the keywords in the array. | ||
|
||
|
||
## UserActivityHistoryQuery.ContentType property | ||
|
||
|
||
This property is a string that specifies the content type associated with the activity you want | ||
to retrieve. It allows the inclusion of an asterisk as a wildcard - e.g., "image/*" will match | ||
all content types beginning with "image/", such as "image/png", "image/jpeg", etc. | ||
This property is case-insensitive. | ||
Review comment: I'd make it clear you can leave it null / empty-string to match any content. |
||
|
||
## UserActivityHistoryQuery.EarliestStartTime property | ||
|
||
This is an optional DateTime property that specifies the earliest start time of the activity | ||
you want to retrieve. Any activities with a StartTime property earlier than this will be excluded. | ||
If this property is unspecified, it will be ignored. | ||
|
||
## UserActivityHistoryQuery.EarliestEndTime property | ||
|
||
This is an optional DateTime property that specifies the earliest end time of the activity | ||
you want to retrieve. Any activities with an EndTime property earlier than this will be excluded. | ||
If this property is unspecified, it will be ignored. | ||
|
||
## UserActivityHistoryQuery.LatestStartTime property | ||
|
||
This is an optional DateTime property that specifies the latest start time of the activity | ||
you want to retrieve. Any activities with a StartTime property later than this will be excluded. | ||
If this property is unspecified, it will be ignored. | ||
|
||
## UserActivityHistoryQuery.LatestEndTime property | ||
|
||
This is an optional DateTime property that specifies the latest end time of the activity | ||
you want to retrieve. Any activities with an EndTime property later than this will be excluded. | ||
If this property is unspecified, it will be ignored. | ||
|
||
## UserActivityHistoryOrderBy enum | ||
|
||
This enum specifies what property the results should be ordered by. The options are as follows: | ||
|
||
| Name | Description | | ||
|-|-| | ||
| StartTime | Results will be in descending order of their StartTime property | | ||
| EndTime | Results will be in descending order of their EndTime property | | ||
| DwellTime | Results will be in descending order of the difference between their EndTime and StartTime properties | | ||
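The three orderings in the table can be sketched in C# as follows. `TimedItem`, `SketchOrderBy`, and `OrderSketch` are hypothetical stand-ins for the proposed types, and applying `maxResults` after sorting is an assumption about how `Search` combines the two parameters.

```c#
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the proposed item type and ordering enum.
public sealed record TimedItem(string DisplayText, DateTime StartTime, DateTime EndTime);
public enum SketchOrderBy { StartTime, EndTime, DwellTime }

public static class OrderSketch
{
    // Sorts in descending order by the chosen key, then keeps the first maxResults items.
    public static List<TimedItem> Order(IEnumerable<TimedItem> items, SketchOrderBy orderBy, int maxResults)
    {
        IEnumerable<TimedItem> ordered = orderBy switch
        {
            SketchOrderBy.StartTime => items.OrderByDescending(i => i.StartTime),
            SketchOrderBy.EndTime => items.OrderByDescending(i => i.EndTime),
            // DwellTime is the difference between EndTime and StartTime.
            _ => items.OrderByDescending(i => i.EndTime - i.StartTime),
        };
        return ordered.Take(maxResults).ToList();
    }
}
```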
|
||
# API Details | ||
|
||
```c# (but really MIDL3) | ||
namespace Microsoft.Windows.ApplicationModel.UserActivities | ||
Review comment: Q: Should we put this somewhere else? It's not a general-purpose API that anyone can use. It's specific to AI scenarios and will be VERY locked down as to who can call it. Do we have a top-level "User context stuff useful for AI" namespace? Do we need one?
Review comment: Beat me to it.
|
||
{ | ||
runtimeclass UserActivityHistory | ||
{ | ||
static IVector<UserActivityHistoryItem> Search( | ||
Review comment: Should we have an overload for a single query? |
||
UserActivityHistoryQuery[] queries, | ||
UserActivityHistoryOrderBy orderBy, | ||
UInt32 maxResults); | ||
Review comment: Is there a material benefit to passing maxResults (e.g. perf? or making the user feel better if we include this as part of the consent prompt?). |
||
|
||
static IAsyncOperation<IVector<UserActivityHistoryItem> > SearchAsync( | ||
UserActivityHistoryQuery[] queries, | ||
UserActivityHistoryOrderBy orderBy, | ||
UInt32 maxResults); | ||
|
||
static IVector<String> GetAppsWithUserActivity(); | ||
Review comment: Should this be a list of ProductIds rather than just strings? What data do we have from unpackaged apps recording activities (like Office)? |
||
|
||
static IAsyncOperation<IVector<String>> GetAppsWithUserActivityAsync(); | ||
} | ||
|
||
runtimeclass UserActivityHistoryItem | ||
{ | ||
String AppName { get; }; | ||
Review comment: What is If a packaged app is this the app's AUMID (programmatic id) or DisplayName (localized string for human consumption)? Does this API support unpackaged apps? |
||
String ActivityId { get; }; | ||
String DisplayText { get; }; | ||
String ContentType { get; }; | ||
String ContentUri { get; }; | ||
|
||
String ActivationUri { get; }; | ||
DateTime StartTime { get; }; | ||
DateTime EndTime { get; }; | ||
} | ||
|
||
runtimeclass UserActivityHistoryQuery | ||
{ | ||
UserActivityHistoryQuery(); | ||
|
||
String[] Keywords; | ||
String ContentType; | ||
IReference<DateTime> EarliestStartTime; | ||
IReference<DateTime> EarliestEndTime; | ||
IReference<DateTime> LatestStartTime; | ||
IReference<DateTime> LatestEndTime; | ||
} | ||
|
||
enum UserActivityHistoryOrderBy | ||
{ | ||
StartTime, | ||
EndTime, | ||
DwellTime | ||
}; | ||
} | ||
``` |
Review comment: I have a vague concern about poisoning of history here to trick agents into doing bad things. There may be "nothing" here, but we should threat model it out.
The problem is that the user has no visibility into what UserActivityHistory items an app is saving, and the agent is probably dumb enough to be easily tricked by malformed items.
Basically, a low-privileged app (like a UWP) adds a UserActivityHistory item that claims to be something interesting (including a display string with juicy keywords). It also includes a URI that is malicious (note "malicious" might not mean it actively harms the user directly; it might be malicious in the sense that it furthers phishing attempts or something). Now when the user asks Copilot a query, Copilot finds the (fake) UserActivityHistory item and invokes it on behalf of the user, which ends up somewhere "bad."
The malicious app can't pull this off directly itself, because launching the bad URI either (1) is blocked by UWP security or (2) would look out-of-place when called directly by the app. But by having it open out of context, is it bad?
(Like I said, kind of a vague concern that may not be unique to agents or to this feature or whatever... just I worry about bad actors poisoning the inputs the CUA reasons over.)