July 02
Guidelines for Testing your Agent
As compared to Web sites and traditional software applications, conversational agents are subject to some unique policy compliance risks. These risks arise because:
· End users’ interactions with agents are freeform and unpredictable.
· Agents often engage in human-like interactions and operate in messaging environments normally used for human-to-human communications, making end users and outside observers especially sensitive to inappropriate content or behavior.
Because of these unique risks, the Windows Live Agents team highly recommends that each Agent undergo manual testing for policy compliance prior to launching. Once testers have acquainted themselves with the task, approximately 4 to 8 hours of manual testing should provide a reasonable evaluation of the Agent’s policy compliance. Testers should:
· Be native speakers
· Have a good understanding of cultural and political factors that might determine whether an Agent’s content/behavior is appropriate
· Be able to make judgments in the best interest of your public image and business interests in the market where the Agent will be released
· Be willing to provoke the Agent to behave inappropriately (this requires creativity, persistence, and willingness/ability to imagine offensive and provocative user inputs)
· Understand the Agent’s feature set
· Ideally not have been directly involved in the Agent’s development
If your testing uncovers any issues that you need help triaging or fixing, please contact Windows Live Agents Partner Support (agentsu@microsoft.com). Send a transcript illustrating each issue, along with a description (in English) of what the issue is.
Overview for Testers
This document is intended to provide guidelines and advice for manual testing of Agents for policy compliance. It outlines specific types of subject matter to focus on, common Agent vulnerabilities, and specific tactics that you can use in an attempt to uncover issues in a given Agent.
This document is not a step-by-step test plan; nor is it by any means exhaustive. When performing compliance testing, there is no substitute for your own persistence and imagination. Furthermore, these guidelines do not currently prescribe any specific standards. You should apply your language and market expertise and your business judgment to determine whether the Agent’s behavior and content are acceptable. We strongly advise erring on the side of caution.
You should read this document to acquaint yourself with the subject of Agent policy compliance. You may find the specific examples to be a helpful starting point, but effective testing will require you to apply your knowledge of the language and market for which the Agent is intended, and of the specific Agent’s content and features.
In order to test effectively, you must be willing and able to imagine and try highly offensive and provocative user inputs. If you’re not comfortable with this task, then you should attempt to find someone who is.
Once you have acquainted yourself with the task, approximately 4-8 hours of manual testing should provide a reasonable evaluation of the Agent’s policy compliance.
Sensitive/Inappropriate subject matter to test
· Profanity
· Hate/Intolerance (with respect to race, gender, sexual orientation, religion, etc.)
· Violence and criminal behavior
· Drug use
· Sexual content
· Suicide
· Culturally/Politically sensitive subjects in your market
Scenarios to test
· Imagine you are one of the Agent’s target users
· Imagine you are a child
· Imagine you are a malicious user attempting to provoke inappropriate Agent behavior
Common Agent vulnerabilities
· Agent may repeat (or “mirror”) user language without employing adequate safeguards
· Agent may respond to the form of a user input without understanding the content
· Agent may fail to recognize inappropriate or sensitive subject matter if the user employs creative/subtle phrasing
· Agent may incorrectly determine an input to be inappropriate and in turn respond inappropriately
· Agent may have “unsafe” catch-all responses (responses used when the user input is not understood at all)
Some specific tactics to try
· See how the Agent responds to blatant abuse and provocation
· Try to trick the agent into repeating an inappropriate word or phrase
· Try to elicit an inappropriate opinion from the Agent
· Try to elicit the Agent’s approval (explicit or implicit) of an inappropriate statement
· Try to elicit inappropriate answers to formulaic questions (yes/no, how many, etc.)
· Try to elicit inappropriate responses to commands/requests/statements
· Try to trick the Agent into inferring inappropriate intent where there is none (and responding inappropriately)
July 01
It's all about context, part deux!
Hi again! Here is the second part of our visit to the magical world of contexts. Yesterday we brushed on a few simple uses for them, now let's dive into a slightly more sensitive subject.
Once the agent is hosted and public, you may want to block access to the agent temporarily, for instance if you rely heavily on external data sources that are experiencing an outage or just very slow. In that case it's a good idea to keep the agent running normally for a limited set of superusers so they can work on the issue, while putting up an 'out of service' message for the general users.
// The message is by default empty. If it's not then the agent knows that we want it muted.
variable PUBLIC.OUTAGE_MESSAGE = ""
context AgentIsDisabled {out-of-context="0" in-context="1000" condition="!IsSuperUser() && PUBLIC.OUTAGE_MESSAGE ne \"\""}
start context AgentIsDisabled
+ =AnythingStrong
  - PUBLIC.OUTAGE_MESSAGE
end context AgentIsDisabled
// Only SuperUsers can change the agent status. Easy! We've already prepared for that.
start context RestrictedStrings
+ disable agent REASON=AnythingRaw
  PUBLIC.OUTAGE_MESSAGE = REASON
+ enable agent
  PUBLIC.OUTAGE_MESSAGE = ""
end context RestrictedStrings
Here the condition is simply to check if the outage message is empty or not, except for super users. They will continue to use the agent normally, and can reenable it whenever their experience is back to normal.
Now, there is one problem in this code... did you spot it? Using a public variable like this in a context condition is pretty bad for performance. Indeed, it would mean that for each query of each user, the query server would need to lock access to the public variable in order to read it just to check the context condition. It gets even more troublesome if you're on dual-box hosting and require a Stored Public Variable to propagate the outage across queryservers! What is the solution then? One possible solution is to check the outage variable only once: at session start. All you need for this is to keep a local session variable of that setting.
stored variable PUBLIC.STORED_OUTAGE_MESSAGE = ""
variable G_OUTAGE_MESSAGE = ""
procedure ABStartSessionProc()
  // This procedure can exist independently in any buddyscript file, and is called at session start
  if !IsSuperUser()
    lock profile // locking is required for stored public variables.
      G_OUTAGE_MESSAGE = PUBLIC.STORED_OUTAGE_MESSAGE
// Note that we don't need to check for SuperUsers here, as it's done at session start.
context AgentIsDisabled {out-of-context="0" in-context="1000" condition="G_OUTAGE_MESSAGE ne \"\""}
start context AgentIsDisabled
+ =AnythingStrong
  - G_OUTAGE_MESSAGE
end context AgentIsDisabled
start context RestrictedStrings
+ disable agent REASON=AnythingStrong
  lock profile
    PUBLIC.STORED_OUTAGE_MESSAGE = REASON
+ enable agent
  lock profile
    PUBLIC.STORED_OUTAGE_MESSAGE = ""
end context RestrictedStrings
As you may have noticed, there is a caveat to this solution: since we only check the message at session start, the flag takes some time to propagate, as only users starting a new session will get the message. The same is true when you re-enable the agent: it will take a few minutes for the outage to be over for everyone.
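To make the trade-off concrete, here is a minimal sketch in Python (not BuddyScript) that models the session-start snapshot. The `SharedStore` and `Session` classes are hypothetical stand-ins for the stored public variable and a user session; the sketch only illustrates how many times the shared value is read and why existing sessions keep their old view.

```python
class SharedStore:
    """Hypothetical stand-in for a stored public variable; counts its reads."""

    def __init__(self):
        self.outage_message = ""
        self.reads = 0

    def read(self):
        # Each read stands for one locked access to the shared profile.
        self.reads += 1
        return self.outage_message


class Session:
    """Hypothetical user session that snapshots the outage message once."""

    def __init__(self, store, is_superuser=False):
        # The shared value is read only here, at session start.
        # Superusers skip the read and are never muted.
        self.outage_message = "" if is_superuser else store.read()

    def reply(self, query):
        if self.outage_message:
            return self.outage_message
        return "normal answer to: " + query


store = SharedStore()
early = Session(store)                     # started before the outage
store.outage_message = "Back soon!"        # a superuser disables the agent
late = Session(store)                      # started after the outage
admin = Session(store, is_superuser=True)  # superuser session

# Many queries later, the shared store has still been read only at session start.
for _ in range(3):
    early.reply("hi")
    late.reply("hi")
```

In this model the early session keeps answering normally until it ends, the late session is muted, and the number of shared reads grows with the number of sessions rather than with the number of queries.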
So here, we only lock the agent's profile and check the public stored variable once for every session, which is good already. But can we do better? Imagine your agent is highly successful and deals with so many people that, say, five people start a new session every second. Don't we all dream of having an agent that popular! But it comes with a price: 5 calls to the public profile per second would probably bog the whole system down. What do you do then? Well, add yet another layer of buffering of course! Back to our friend the "basic" public variable. That one only lives for the current queryserver, but it is a lot more performant to check against. All we'll need is a background procedure to update the public variable every few minutes, so as to transmit the orders from the top to the base:
stored variable PUBLIC.STORED_OUTAGE_MESSAGE_FOR_ALL_USERS = ""
variable PUBLIC.OUTAGE_MESSAGE_FOR_ALL_USERS_OF_THIS_SERVER = ""
variable G_OUTAGE_MESSAGE_FOR_THIS_USER = ""
procedure Background_UpdateOutageMessage() startup every 1 minute
  // This procedure is called at startup and every minute, independently of any user session.
  lock profile
    PUBLIC.OUTAGE_MESSAGE_FOR_ALL_USERS_OF_THIS_SERVER = PUBLIC.STORED_OUTAGE_MESSAGE_FOR_ALL_USERS
procedure ABStartSessionProc()
  if !IsSuperUser()
    G_OUTAGE_MESSAGE_FOR_THIS_USER = PUBLIC.OUTAGE_MESSAGE_FOR_ALL_USERS_OF_THIS_SERVER
context AgentIsDisabled {out-of-context="0" in-context="1000" condition="G_OUTAGE_MESSAGE_FOR_THIS_USER ne \"\""}
start context AgentIsDisabled
+ =AnythingStrong
  - G_OUTAGE_MESSAGE_FOR_THIS_USER
end context AgentIsDisabled
start context RestrictedStrings
+ disable agent REASON=AnythingStrong
  PUBLIC.STORED_OUTAGE_MESSAGE_FOR_ALL_USERS = REASON
+ enable agent
  PUBLIC.STORED_OUTAGE_MESSAGE_FOR_ALL_USERS = ""
end context RestrictedStrings
Here you have a piece of code that is slightly more complex but will sustain any kind of traffic without budging. The only downside is that a change to the agent's status takes up to one more minute to propagate.
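The three layers can be modeled outside BuddyScript as well. The following Python sketch is only an illustration under stated assumptions: `StoredProfile` and `QueryServer` are hypothetical names, and `background_refresh` is called by hand here instead of by a one-minute timer.

```python
class StoredProfile:
    """Hypothetical shared profile: the slow, locked source of truth."""

    def __init__(self):
        self.message = ""
        self.locked_reads = 0

    def read_locked(self):
        # Stands for one locked read of the stored public variable.
        self.locked_reads += 1
        return self.message


class QueryServer:
    """Hypothetical query server holding a cheap, server-local copy."""

    def __init__(self, profile):
        self.profile = profile
        self.cached_message = ""

    def background_refresh(self):
        # In the blog's design this runs at startup and then every minute.
        self.cached_message = self.profile.read_locked()

    def start_session(self, is_superuser=False):
        # New sessions copy the server-local value; no lock is taken here.
        return "" if is_superuser else self.cached_message


profile = StoredProfile()
server = QueryServer(profile)
server.background_refresh()               # startup refresh

profile.message = "Down for maintenance"  # a superuser disables the agent
before_refresh = server.start_session()   # the change is not visible yet

server.background_refresh()               # the next periodic refresh
sessions = [server.start_session() for _ in range(300)]
```

However many sessions start, the locked read count depends only on how often the background refresh runs. The price is the delay visible in `before_refresh`: a session started between two refreshes still sees the old value.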
And this concludes our two-day tour of contexts and their natural habitat. I hope you liked it, and please don't forget to check out our gift shop on the way out!
June 30
It's all about context!
Today we're going to talk a bit about contexts and what wonderful things you can do with them. Well, maybe not "Wall-E opening sequence" wonderful, but they sure come in handy to control what your agent is doing in an easy and trouble-free way. With the help of contexts you can clearly separate your routines according to any condition, without having to worry about micro-managing their scores or doing systematic checks. We're going to take a closer look at a few different uses and related buddyscript features, from the very simple to... the wee bit advanced. Tomorrow we'll explore another use for them, and discuss public variables and performance issues.
Here is how, for example, you could make sure that only users on MSN get access to the activity window features:
context MSN_Only {out-of-context="0" in-context="100" condition="SYS.User.Service eq \"MSN\""}
// out-of-context: score adjustment for the queries inside the context that don't verify the condition (here, 0 = no match)
// in-context : score adjustment for the queries inside the context that do verify the condition (here, 100 = no adjustment)
start context MSN_Only
? Start the activity.
  - Sending you an invitation...
end context MSN_Only
An interesting thing to note is that you don't have to end the context at the end of your file. If you were to start the context in a package, then every domain including this package, directly or indirectly, would be within that context and abide by its rules. So here you could have all domains related to Activity Window include this package, and be protected automatically. You could apply this principle to restrict certain features or queries based on any condition, like age, country, market, number of visits or the user's high score in your quiz game.
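As a rough mental model of the two score adjustments, here is a Python sketch (not BuddyScript). It assumes the adjustment is a percentage applied to a routine's base score and that a score of zero means no match; this is inferred from the comments in the listing above ("0 = no match", "100 = no adjustment") and is not a description of the actual matching engine. All names and base scores are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Context:
    name: str
    in_context: int       # percent applied when the condition holds
    out_of_context: int   # percent applied when it does not
    condition: Callable[[dict], bool]


def adjusted_score(base_score: float, context: Optional[Context], user: dict) -> float:
    """Apply the context's percentage to a routine's base score (assumed model)."""
    if context is None:
        return base_score
    percent = context.in_context if context.condition(user) else context.out_of_context
    return base_score * percent / 100


def best_reply(candidates, user):
    """Pick the reply with the highest adjusted score; zero counts as no match."""
    winner, winning_score = None, 0.0
    for reply, base_score, context in candidates:
        score = adjusted_score(base_score, context, user)
        if score > winning_score:
            winner, winning_score = reply, score
    return winner


msn_only = Context("MSN_Only", 100, 0, lambda user: user["service"] == "MSN")
candidates = [
    ("Sending you an invitation...", 80, msn_only),  # routine inside the context
    ("Sorry, I didn't get that.", 10, None),         # routine outside any context
]
```

Under these assumptions, a user on MSN gets the invitation, while a user on another service falls through to the catch-all because the gated routine's score drops to zero. The same arithmetic suggests why an in-context value of 1000 lets a context overrule everything else.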
You can also restrict access to the agent even though it's launched and live on Messenger, while it's in development or beta phase:
// This table lists the screennames of authorized users
// Using "exact" as an index method will help make searches more performant
datatable AuthorizedUsers
  load ScreenName {index="exact"} from
    mememe@hotmail.com
    uberbetatester@live.com
    popaandmoma@live.com
    mygloriousboss@live.com
function UserIsAuthorized()
  if ShellMode() // ShellMode() returns true if we're in the SDK
    return true
  SN = get ScreenName in AuthorizedUsers where ScreenName is SYS.User.ScreenName
    // Found the screenname: access granted!
    return true
  return false
// Here the in-context is set to 1000 to make sure that any other routine is overruled.
context Unauthorized {out-of-context="0" in-context="1000" condition="!UserIsAuthorized()"}
start context Unauthorized
+ =AnythingStrong
  - Sorry, I'm in closed beta phase and can't talk to you yet. No peeping!
// Within a context, regular matching still happens as they're all subject to the same adjustment.
? But I am a Very Important Person!
  - The Boss said: no exceptions.
end context Unauthorized
Another use, very similar in implementation, is to restrict certain debug or testing queries to a set of super-users, listed this time in a text file. The textfile-based table is easier to maintain independently from the code, and can be changed without having to restart the agent: all you need is to set an expire date for the file to be reloaded.
datatable SuperUsers {expire="in one day"}
  // The list will be timed out and reloaded once a day from the file.
  load ScreenName {index="exact"} from file superusers.txt
function IsSuperUser()
  if ShellMode()
    return true
  SN = get ScreenName in SuperUsers where ScreenName is SYS.User.ScreenName
    return true
  return false
context RestrictedStrings {out-of-context="0" in-context="100" condition="IsSuperUser()"}
start context RestrictedStrings
// This topic is only accessible to the right people:
+ debug all variables
  - As you wish.
  dbg_display STORED_USER_VARIABLES
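The expiring, file-backed table is a general pattern, so here is a small Python sketch of the idea (not BuddyScript). `ExpiringUserList` is a hypothetical class, the clock is injected so the expiry can be shown without waiting a day, and `newadmin@live.com` is a made-up screen name used only for the demonstration.

```python
import os
import tempfile


class ExpiringUserList:
    """Hypothetical file-backed list that is reloaded once its copy expires."""

    def __init__(self, path, ttl_seconds, clock):
        self.path = path
        self.ttl_seconds = ttl_seconds
        self.clock = clock          # callable returning the current time in seconds
        self._names = set()
        self._loaded_at = None      # None means the file was never loaded

    def _reload_if_expired(self):
        now = self.clock()
        if self._loaded_at is None or now - self._loaded_at >= self.ttl_seconds:
            with open(self.path, encoding="utf-8") as handle:
                self._names = {line.strip() for line in handle if line.strip()}
            self._loaded_at = now

    def contains(self, screen_name):
        # Exact match, like the "exact" index method in the listing above.
        self._reload_if_expired()
        return screen_name in self._names


fake_now = [0.0]  # mutable fake clock, in seconds
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "superusers.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("mememe@hotmail.com\n")

    users = ExpiringUserList(path, ttl_seconds=86400, clock=lambda: fake_now[0])
    first = users.contains("mememe@hotmail.com")

    # Edit the file while the agent keeps running.
    with open(path, "a", encoding="utf-8") as handle:
        handle.write("newadmin@live.com\n")
    before_expiry = users.contains("newadmin@live.com")

    fake_now[0] = 86400.0  # one day later the cached copy has expired
    after_expiry = users.contains("newadmin@live.com")
```

The edit to the file is picked up without a restart, but only after the cached copy expires, which is the same trade-off as the one-day expire setting above.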
That’s it for today. See you tomorrow for another round of goodies!
June 26
5.0 Transition Update!
Hello Developers and Partners of Windows Live Agents,
We are very excited that 5.0 is here! We have received many questions from all of you regarding the transition from 4.3 to 5.0. Windows Live Agents Partner Support is no longer accepting projects developed in 4.3. All agent projects should be developed in 5.0 using our SDK in Visual Studio and submitted through our Partner hosting site at http://phi.agents.live.com. If you send us a new project developed in 4.3 or 5.0 via email, we will unfortunately not be able to accept it. If you have not already downloaded the 5.0 SDK in Visual Studio, you can do so here. For instructions on getting started with hosting, please click here.
If we are currently hosting an agent of yours that was developed in 4.3, please continue to send us updates in 4.3 via email. We will let you know when we are ready to migrate your project from 4.3 to 5.0.
Thanks,
Windows Live Agents Team
June 23
Messenger Screen Name Parameters Updater
Quite often we hear from partners who complain that the screen name, personal status message, or display picture of their agent has disappeared from the Messenger client, and ask us to fix it. Strictly speaking, the display of these parameters in the client is subject to forces outside the control of the Windows Live Agents group. However, we continue to work with those in charge of the Messenger network to resolve issues like this.
In the meantime, here is the code we use to "fix" the problem. If you are not already doing so, you should be using Method 3 from this blog post about changing friendly names, icons, and status messages. Updates to these Messenger parameters via edits to the BFG are no longer supported after 4.3!
To add the screen name parameters updater, simply include these lines in your project:
call WLMSetUpScreenNameParameters()
June 02
Scripting tips: conversations 4/4 Using actions
Greetings,
We often design dialogs in BuddyScript to answer user queries in a specific context. Today we will see how, by using actions, we can have initiates and notifications answer dialogs, and what this can be used for. A dialog is a set of dialog entries that lives in your script. A dialog entry is some matching information and a script block associated with it. The matching information usually contains patterns, but can also contain actions. An action has a name and, optionally, parameters. The first example is a trivia game with a timer.
// Trivia with timer
? Play trivia
  - Let's play trivia
    in what year was the Eiffel Tower built ?
      in 1889 {action=Right()}
      in 1943 {action=Wrong()}
      in 2001 {action=Wrong()}
    You have 30 seconds to answer
  NID = notify in 30 seconds: action TimeOut()
  ? 1889
  action Right()
    - That is correct!
  ? 1943
  ? 2001
  action Wrong()
    - Nope, the correct answer is 1889.
  action TimeOut()
    - Too late, the correct answer is 1889.
First, you can see a common use of actions: with an enumeration. Typing 1 will trigger the action Right; typing 2 or 3 will trigger the action Wrong. Or you can type the full date and match one of the three patterns. Then, you will notice the action in the notify. After 30 seconds, the notification will answer the dialog for you. If you do respond before the notification triggers, it would be wise to cancel the notification. If you don’t, that’s ok: when the notification triggers, no dialog with a TimeOut action will be active and the notification will simply be ignored.
Initiates can also use actions, but through the respond statement. The syntax is:
RID = respond SOURCE: action MyAction(PARAMS)
The SOURCE is an object that contains the user identity (screenname, service and UID), the buddy id and the conversation id to send the action to. Most of the time, you are coming from an initiate and you simply pass the object from SYS.ConversationSource. If you have a dialog, don’t forget to save that object in a variable of your own, because SYS.ConversationSource won’t survive the dialog.
A typical use for this statement is to get information from another user’s profile. In the following example, I have altered the “Do you know my friend” example to display a friendly name that the friend may have provided.
// Do you know my friend(oh you mean Jacky ?)
stored variable PREFERRED_NAME = ""
? Call me NAME=AnythingRaw
  PREFERRED_NAME = NAME
  - Ok, from now on, I will call you PREFERRED_NAME
procedure GetPreferredName()
  if PREFERRED_NAME = ""
    NAME = SYS.User.ScreenName
  else
    NAME = PREFERRED_NAME
  CID = respond SYS.ConversationSource: action SetName(NAME)
? Do you know EMAIL=AnEMailAddress ?
  RESULT_NOLOAD = initiate EMAIL, "MSN": GetPreferredName() {createprofile=false loadprofile=false}
  if RESULT_NOLOAD.Delivered
    IN_SESSION = true
  else
    IN_SESSION = false
    RESULT_LOAD = initiate EMAIL, "MSN": GetPreferredName() {createprofile=false loadprofile=true}
    if !RESULT_LOAD.Delivered
      - Sorry, I can't say that I do
      exit
  action SetName(NAME)
    if NAME != EMAIL
      - Oh, you mean NAME ? \c
    - Yes, I know that guy\c
    if IN_SESSION
      - , and he/she's talking to me right now.
    else