Identifying trannies with AI

Kosher Dill

Potato Chips
True & Honest Fan
kiwifarms.net
I could download the videos and run facial recognition, but that would be a lot of bandwidth and space for a one-off project.
Get some Amazon MTurk workers to do it for a few pennies. They'll appreciate the diversion.
... Well, at least the ones working on your female dataset will.
 

The Reaper

Be more kind, my friends.
True & Honest Fan
kiwifarms.net
>twice as many views for the "better traps" folder

Alright, you guys have got some 'splaining to do.
To be fair, the girls.zip is nearly triple the size, so most guys are likely going for the quickest download that takes up the least amount of space.
The six Kiwis in the girls folder are getting a lot more bang for their buck right now.
 

PowerWomon

Kiwi Farm's Most Hated Feminist
kiwifarms.net
4: FaceNet states explicitly in the whitepaper that this is not a problem. Human faces are not symmetrical, but the included test suite tests for tilting up to 10deg etc.
I probably worded it badly, but I think you misunderstand. It's not so much about whether your library handles it - if it didn't, it would be pretty poor. If you are experiencing issues with class imbalance and inaccuracy, this might help you especially.

For example, if you had images of the same person in different lighting conditions, with different croppings, you would expect them to be categorized the same way. That way, the machine learning algorithm has an easier time homing in on which features you want it to pay attention to. What you are practically doing is showing the algorithm different examples of images, but it does not know which aspects of the image it is supposed to care about; it figures that out by itself. Just as in the example of images of ducks and cats being categorized on irrelevant features, data augmentation might help you tell the algorithm what to pay attention to. If you have a dark and a light image of a woman, and they are both classified as women, the algorithm learns that it is not the lighting that is decisive. For a similar reason, dropout layers are advantageous: a machine learning algorithm will likely use the easiest-to-detect features to classify your image, and random dropouts will take away some of those features from time to time.
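To make that concrete, here is a minimal sketch of what I mean by augmentation, in plain numpy (the augment helper is made up for illustration; in practice you would use your framework's augmentation utilities):
Code:
import numpy as np

def augment(img):
    """Return simple augmented variants of an HxWxC image with values in [0, 1]."""
    variants = [img, np.fliplr(img)]          # original plus horizontal flip
    for scale in (0.7, 1.3):                  # a darker and a brighter copy
        variants.append(np.clip(img * scale, 0.0, 1.0))
    return variants

# every variant keeps the original label, so lighting and orientation
# cannot be the features the classifier latches onto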


Hinton, Geoffrey E., Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, and Ruslan R. Salakhutdinov. "Improving neural networks by preventing co-adaptation of feature detectors." arXiv preprint arXiv:1207.0580 (2012).


Why should you care about data augmentation?
Recent advances in deep learning models have been largely attributed to the quantity and diversity of data gathered in recent years. Data augmentation is a strategy that enables practitioners to significantly increase the diversity of data available for training models, without actually collecting new data. Data augmentation techniques such as cropping, padding, and horizontal flipping are commonly used to train large neural networks. However, most approaches used in training neural networks only use basic types of augmentation. While neural network architectures have been investigated in depth, less focus has been put into discovering strong types of data augmentation and data augmentation policies that capture data invariances.
It seems to me that you should try an easier version of the problem first rather than going right to identifying 40-pixel-wide faces from porn stills. Why not use a dataset of higher-res images cropped to just the face looking directly at the camera? Get a proof of concept done first before moving on to the harder tasks.
The test images provided with the library you're using are much higher-res than yours; they may not even have tested it on low-res images.

EDIT: and if you do get something working, then you can progressively lower the input images' resolution and see how strong the "beer goggles" have to be before your algorithm is fooled :lol:
But that part is practically already being done for him by that library. Most machine learning networks use quite low-resolution representations, such as the 256x256 ImageNet images (cropped to 224x224) for AlexNet. We are not at the point where we can use very high resolutions; the number of features we would have to learn would make computation time explode.

On the one hand, it is amazing from how little data neural networks can categorize images. On the other hand, it also means that neural networks can be tricked by very, very tiny features, sometimes as small as a single pixel, even on a network that otherwise gets super-human accuracy on images.

[Image: the adversarial panda example from Goodfellow et al.]

Szegedy, Christian, et al. "Intriguing properties of neural networks." arXiv preprint arXiv:1312.6199 (2013).
Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. "Explaining and harnessing adversarial examples." arXiv preprint arXiv:1412.6572 (2014).
Su, Jiawei, Danilo Vasconcellos Vargas, and Kouichi Sakurai. "One pixel attack for fooling deep neural networks." IEEE Transactions on Evolutionary Computation (2019).
[Image: further adversarial attack examples]


It is part of why I was asking him to do some basic sanity checks. In this case, he is already getting back a set of feature vectors, pre-digested, so it might be a bit tough to map it back to the picture. That is part of the "interpretability challenge" that exists in machine learning. It can be very hard for a human to tell what a neural network is doing.

However, there are ways, for both fully connected and convolutional networks, to visualize which portions of the network activate for a given image. It's still very hard to debug what a neural network is doing wrong, though, and then nearly impossible to adjust training based on that. At least, I can only think of very ideal situations where I've been able to do that.

That is why it might be a good idea to look at the confusion matrix and also see which images are misclassified. Do these individuals just pass particularly well, or is some pattern outside of the features that are supposed to be recognized responsible for the misclassification?
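As a rough sketch of the kind of sanity check I mean (this assumes the clf, X_test and y_test from his script are still around, and that a parallel list of source filenames was kept when generating the embeddings, which is hypothetical):
Code:
import numpy as np
from sklearn.metrics import confusion_matrix

y_pred = clf.predict(X_test)
print(confusion_matrix(y_test, y_pred, normalize='true'))

# indices of the misclassified samples
wrong = np.where(y_pred != np.asarray(y_test))[0]

# 'filenames' would be a list kept in parallel with the embeddings
for i in wrong[:20]:
    print(filenames[i], 'true:', y_test[i], 'predicted:', y_pred[i])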
 

hundredpercent

kiwifarms.net
I probably worded it badly, but I think you misunderstand. It's not so much about whether your library handles it - if it didn't, it would be pretty poor. If you are experiencing issues with class imbalance and inaccuracy, this might help you especially.

For example, if you had images of the same person in different lighting conditions, with different croppings, you would expect them to be categorized the same way. That way, the machine learning algorithm has an easier time homing in on which features you want it to pay attention to. What you are practically doing is showing the algorithm different examples of images, but it does not know which aspects of the image it is supposed to care about; it figures that out by itself. Just as in the example of images of ducks and cats being categorized on irrelevant features, data augmentation might help you tell the algorithm what to pay attention to. If you have a dark and a light image of a woman, and they are both classified as women, the algorithm learns that it is not the lighting that is decisive. For a similar reason, dropout layers are advantageous: a machine learning algorithm will likely use the easiest-to-detect features to classify your image, and random dropouts will take away some of those features from time to time.


Hinton, Geoffrey E., Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, and Ruslan R. Salakhutdinov. "Improving neural networks by preventing co-adaptation of feature detectors." arXiv preprint arXiv:1207.0580 (2012).


Why should you care about data augmentation?
I already use FaceNet (https://arxiv.org/pdf/1503.03832.pdf) to get the embedding of the face, and that should be very resistant to such changes.

But that part is practically already being done for him by that library. Most machine learning networks use quite low-resolution representations, such as the 256x256 ImageNet images (cropped to 224x224) for AlexNet. We are not at the point where we can use very high resolutions; the number of features we would have to learn would make computation time explode.

On the one hand, it is amazing from how little data neural networks can categorize images. On the other hand, it also means that neural networks can be tricked by very, very tiny features, sometimes as small as a single pixel, even on a network that otherwise gets super-human accuracy on images.

Szegedy, Christian, et al. "Intriguing properties of neural networks." arXiv preprint arXiv:1312.6199 (2013).
Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. "Explaining and harnessing adversarial examples." arXiv preprint arXiv:1412.6572 (2014).
Su, Jiawei, Danilo Vasconcellos Vargas, and Kouichi Sakurai. "One pixel attack for fooling deep neural networks." IEEE Transactions on Evolutionary Computation (2019).

It is part of why I was asking him to do some basic sanity checks. In this case, he is already getting back a set of feature vectors, pre-digested, so it might be a bit tough to map it back to the picture. That is part of the "interpretability challenge" that exists in machine learning. It can be very hard for a human to tell what a neural network is doing.

However, there are ways, for both fully connected and convolutional networks, to visualize which portions of the network activate for a given image. It's still very hard to debug what a neural network is doing wrong, though, and then nearly impossible to adjust training based on that. At least, I can only think of very ideal situations where I've been able to do that.

That is why it might be a good idea to look at the confusion matrix and also see which images are misclassified. Do these individuals just pass particularly well, or is some pattern outside of the features that are supposed to be recognized responsible for the misclassification?
I think there is some more fundamental issue here. Is there any reason why the false negative rate would be so extremely high? It doesn't seem to be a class imbalance thing, since the tranny pics get more weight.

I would train a neural network on the output of FaceNet, but the data set is pretty small.
 
  • Like
Reactions: PowerWomon

PowerWomon

Kiwi Farm's Most Hated Feminist
kiwifarms.net
I already use FaceNet (https://arxiv.org/pdf/1503.03832.pdf) to get the embedding of the face, and that should be very resistant to such changes.
Ouch, yeah, I just thought about that right after I hit reply. I'm sorry for the confusion. If you are getting exactly the same feature vectors out after image augmentation, which should be the case with a decent facial recognition library, then augmentation is redundant nonsense, of course. If the cosine distance, or whatever other metric they use, is practically zero between any two (augmented) images of the same person, then it is of no use.
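A quick smoke test for this would be something like the following (emb_a and emb_b being the embeddings you get for an image and, say, its mirrored copy; made-up variable names):
Code:
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# if this is ~1.0 for an image and its augmented copy, feeding
# augmented images through the embedding buys you nothing new
print(cosine_similarity(emb_a, emb_b))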

I think there is some more fundamental issue here. Is there any reason why the false negative rate would be so extremely high? It doesn't seem to be a class imbalance thing, since the tranny pics get more weight.
Well, yes. There is a reason, of course. The simplest reason might be that the data simply do not deliver what you are looking for. For example, if there is not enough systematic signal and the markers are too random, the classifier won't be able to classify correctly, just as if you tried to classify the brand of a car based on the paint job alone.

How much can you turn the knobs on the hyperparameters? I will try to take the time and run your code and see if I can get any better results, plus read the scikit documentation.

It might also be helpful to make the problem a little easier. Use nearly balanced datasets, for troubleshooting, that contain feminine women vs. your dataset of non-passing transgender people. It might also be of significance which race they are, so maybe try to run a test scenario with only Caucasian people. If your dataset were imbalanced and you only had black transgender women and only white women, it would probably learn to distinguish Caucasian from non-Caucasian facial features.

It might very well be that your passing transgender people really do pass well enough that it becomes hard to distinguish them.
I would train a neural network on the output of FaceNet, but the data set is pretty small.
Do you mean small in terms of the number of features? That should be fine, even if it's very pre-filtered data. There might be very few features, but if the facial features it does give you can be used in any way to classify the images, that should be enough nonetheless.
 

wtfNeedSignUp

kiwifarms.net
I don't really follow the argument here but the problem is pretty simple:
1. Use some basic method to find location of face in image.
2. Cut a square around the face and change its size to be uniform (32x32 seems okay).
3. Use a basic bitch neural network (like the small CNNs used on CIFAR) with a sigmoid at its end as a true/false classifier. If the input is very small (32x32) then you don't need a large dataset; at worst, augment the images by changing the lighting and the angle of the image. You can also upgrade the network with fancy stuff like residual blocks.
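A rough sketch of what I mean, in Keras (layer sizes are placeholders, not tuned):
Code:
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(1, activation='sigmoid'),   # single true/false output
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])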
 
  • Like
Reactions: PowerWomon

PowerWomon

Kiwi Farm's Most Hated Feminist
kiwifarms.net
Thank you for the input. Maybe I am really not understanding the problem very well, but here are my thoughts on your suggestions:
1. Use some basic method to find location of face in image.
I think that is already done at this point, by Seeta. That library is, in his current approach, responsible for the embeddings he is using. That step is already finished at the point of the code he is showing, because he appears to be loading the embeddings and classifications from .npy files (AKA "numpy dumps" or whatever you like to call them). He is loading two of them at the beginning of the code he posted.

But I do not think the suggestion is completely beside the point. I had already suggested some sanity checks. Seeta gives you two outputs, from what I see in the documentation: the locations in the image where it detects the face, and the facial features. If the face is correctly located, which he can verify with a few smoke tests, then this part should be fine. I had also suggested using different pictures of the same person to see that the facial recognition really is working well. If he is getting a feature vector back that can be reliably used to match a person (see CalcSimilarity), then that feature vector should also be good enough to check whether a person is transgender, male, or female.
2. Cut a square around the face and change its size to be uniform (32x32 seems okay).
If he already has the embeddings then that is not needed anymore, IMHO. Those already represent the face. That seems to be the 128 element feature vector (or however large it happens to be) that comes out of the library.
[screenshot of the library documentation]


3. Use a basic bitch neural network (like the small CNNs used on CIFAR) with a sigmoid at its end as a true/false classifier. If the input is very small (32x32) then you don't need a large dataset; at worst, augment the images by changing the lighting and the angle of the image. You can also upgrade the network with fancy stuff like residual blocks.
Sigmoid at its end? I'm not sure what you mean. Sigmoids are usually used as activation functions. Maybe that's a neural network architecture I am not familiar with. What you have at the end is usually a softmax with cross-entropy loss for classification.


A sigmoid function at a fully connected final layer probably does not make sense for a classification, as you would get something like an activation threshold back, instead of something like a probability distribution over your classes you want to categorize your samples into.

at worst augment the images by changing the lighting + change the angle of the image.
I think this has been covered. I had made the same error in my thinking. If the lighting has no impact on the similarity score one gets from the embedding, then it makes no sense to use data augmentation. The training samples would be pretty much redundant. However, this, too, might be a good smoke test to see that the library really is correctly used and delivers what it should: landmarks of the facial features that identify a person. If dim and bright lighting make a substantial difference to the metric, though, then yes, you have a problem.

You can also change the network with fancy stuff like residual blocks.
Yes, residual blocks have been a neat and fancy feature to enhance image recognition accuracy, but I am not sure how much they would help here. It's not so easy to change the architecture after the fact, once a neural network has been trained. I have never seen skip connections added after the fact. They are usually also only of use in very deep neural networks. This neural network has already been trained, and the few layers that would probably be needed to categorize the facial features would likely be no deeper than a few fully connected layers.
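For reference, a skip connection is just this kind of thing (Keras functional style; sizes are made up, and the input must already have the same width for the addition to work):
Code:
from tensorflow.keras import layers

def residual_block(x, units=128):
    """y = f(x) + x: the shortcut lets gradients bypass the block."""
    shortcut = x                              # assumes x already has `units` features
    y = layers.Dense(units, activation='relu')(x)
    y = layers.Dense(units)(y)
    y = layers.Add()([y, shortcut])           # the skip connection
    return layers.Activation('relu')(y)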
 

wtfNeedSignUp

kiwifarms.net
Thank you for the input. Maybe I am really not understanding the problem very well, but here are my thoughts on your suggestions:

I think that is already done at this point, by Seeta. That library is, in his current approach, responsible for the embeddings he is using. That step is already finished at the point of the code he is showing, because he appears to be loading the embeddings and classifications from .npy files (AKA "numpy dumps" or whatever you like to call them). He is loading two of them at the beginning of the code he posted.

But I do not think the suggestion is completely beside the point. I had already suggested some sanity checks. Seeta gives you two outputs, from what I see in the documentation: the locations in the image where it detects the face, and the facial features. If the face is correctly located, which he can verify with a few smoke tests, then this part should be fine. I had also suggested using different pictures of the same person to see that the facial recognition really is working well. If he is getting a feature vector back that can be reliably used to match a person (see CalcSimilarity), then that feature vector should also be good enough to check whether a person is transgender, male, or female.

If he already has the embeddings then that is not needed anymore, IMHO. Those already represent the face. That seems to be the 128 element feature vector (or however large it happens to be) that comes out of the library.
The problem is that you are assuming that the vector will correspond to "tranny" features, which isn't necessarily the case. You can check whether those features are good by just creating a very small network that gets the features as an input (pre-process the images through the first network and save their output to save time) and returns a classification.
Sigmoid at its end? I'm not sure what you mean. Sigmoids are usually used as activation functions. Maybe that's a neural network architecture I am not familiar with. What you have at the end is usually a softmax with cross-entropy loss for classification.

A sigmoid function at a fully connected final layer probably does not make sense for a classification, as you would get something like an activation threshold back, instead of something like a probability distribution over your classes you want to categorize your samples into.
Sigmoid gives you a value in the 0 to 1 range. In classification terms, >0.5 is true and <0.5 is false. Because you have only a single output node instead of two for softmax, it gives the network a far easier time, since it needs to backpropagate a single value rather than do a balancing act for two.
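For the binary case the two are mathematically the same thing anyway; sigmoid(z) is just the softmax over the logits [z, 0]:
Code:
import numpy as np

z = 1.7                                    # arbitrary logit
sigmoid = 1.0 / (1.0 + np.exp(-z))

logits = np.array([z, 0.0])
softmax = np.exp(logits) / np.exp(logits).sum()

print(sigmoid, softmax[0])                 # identical, ~0.8455 both times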
I think this has been covered. I had made the same error in my thinking. If the lighting has no impact on the similarity score one gets from the embedding, then it makes no sense to use data augmentation. The training samples would be pretty much redundant. However, this, too, might be a good smoke test to see that the library really is correctly used and delivers what it should: landmarks of the facial features that identify a person. If dim and bright lighting make a substantial difference to the metric, though, then yes, you have a problem.
If you use the embedding there is no point in data augmentation since you can be sure the original framework was already trained on far more lighting examples. If you use the method of making a new small network for faces then you should augment and see if it makes a difference.
Yes, residual blocks have been a neat and fancy feature to enhance image recognition accuracy, but I am not sure how much they would help here. It's not so easy to change the architecture after the fact, once a neural network has been trained. I have never seen skip connections added after the fact. They are usually also only of use in very deep neural networks. This neural network has already been trained, and the few layers that would probably be needed to categorize the facial features would likely be no deeper than a few fully connected layers.
There is no problem putting them into either of the networks I proposed, since at worst they make convergence faster without improving accuracy.
 
  • Like
Reactions: PowerWomon

PowerWomon

Kiwi Farm's Most Hated Feminist
kiwifarms.net
The problem is that you are assuming that the vector will correspond to "tranny" features, which isn't necessarily the case. You can check whether those features are good by just creating a very small network that gets the features as an input (pre-process the images through the first network and save their output to save time) and returns a classification.
That is a good question, the central question even, probably. I think it will correspond to or contain such features, because if the features can be reliably used to identify a person in various pictures, then they should also be good enough to identify transgender facial features. In the conception of what it means to be transgender that you would like to teach to the machine learning algorithm, a person cannot look transgender and not transgender at the same time. Otherwise, I think, it would also not be able to match the person reliably, and the reliability of the method they use appears very high.
[screenshot: example face pairs from the FaceNet paper]

I mean, damn ... even I have some trouble telling if this is the same person or not in some of those images. So this library, if it really delivers what it says the underlying neural network delivers, appears pretty robust against a variety of image manipulations. Still, I would do a few image flips, or some other simple manipulations. If the similarity metric you can get out of the library is good enough to reliably tell two people apart, then I think it should also be good enough to extract what makes a face look transgender. I hope my logic is sound on this, but maybe I am mistaken and I am being really dense right now. The only other possibility would be, perhaps, that transgender people surprisingly mess up the basic identification of a face somehow, and transgender people cannot be told apart, through this method, from cisgender people who look different to us but are classified as being "the same person" (i.e. get a similarity score that is very high).

Again, the gist of my logic is: The same person cannot look both transgender and not transgender, or it's not the same person, or you have not reliably learned what it means to "look transgender". That is a contradiction with the image library reliably identifying people.

Sigmoid gives you a 0 to 1 range value. In classification terms >0.5 is true and <0.5 is false. Because you have only a single output node instead of 2 for softmax it gives the network a far easier time since it needs to backpropogate a single value rather than do a balancing act for two.
Yes, typically, although that's not always the case. You also have layer and node weights after that. You can normalize those, of course, but then you are practically just doing what a softmax function would have done, i.e. scale the values in such a manner that they add to one, as in a probability distribution. Either way, if we mean the same thing, and we are on the same page here, I do not really care how he gets that distribution. Which activation function he uses is probably a minor detail at this point.

If you use the embedding there is no point in data augmentation since you can be sure the original framework was already trained on far more lighting examples. If you use the method of making a new small network for faces then you should augment and see if it makes a difference.
Right. I think we are on the same page now. That is what I meant. The facial features, i.e. the embeddings that are currently loaded from the numpy files (.npy), should already be robust against that. Do some basic smoke tests that they really are, and that should be fine.

The new neural network I was talking about consists of the extra layers that could be added through transfer learning. In Keras, for example, one would remove the last few layers from a neural network (as those are generally the most specific), mark the remaining weights as fixed, and only train the weights of the layers one has just added. That way, for example, a neural network like AlexNet that is already trained to recognize crows, but perhaps cannot distinguish them from ravens, can be trained to further discriminate between those two classes of birds; one adds a new softmax/cross-entropy layer with the N+1 classes at the end, so that it also knows about ravens in addition to crows (which is also why it does not generally make sense to use a sigmoid activation in the final layer for a classification problem).
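In Keras that procedure looks roughly like this (MobileNetV2 as a stand-in base, since AlexNet doesn't ship with Keras; purely illustrative):
Code:
from tensorflow.keras import layers, models
from tensorflow.keras.applications import MobileNetV2

base = MobileNetV2(weights='imagenet', include_top=False,
                   input_shape=(224, 224, 3))
base.trainable = False                      # freeze the pretrained weights

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(2, activation='softmax'),  # new head, only these weights train
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])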

There is no problem putting them into either of the networks I proposed, since at worst they make convergence faster without improving accuracy.
No, probably not. I just thought it would add too much complexity and additional variables.

Either way, the approach seems basically sound, to me at least. If the facial features are good enough to recognize the same person this reliably, from their face, then it should also be usable to extract transgender status somehow. The approach makes perfect sense to me. The embeddings that he appears to have generated already should be all that is needed and he is using a support vector machine to do a (non-linear) classification on those feature vectors.

I would play with the hyperparameters of the classification algorithm you are using a bit (SVM). Sorry, but even after all this time I still cannot give you anything better than a rule of thumb how to choose some of those hyperparameters. I think that is still a common criticism in most fields of machine learning. I see that you are already defining a Python list and are naming the various configurations for the classifiers, which is a thing I like doing as well. Try a few different classifiers and twiddle with their settings a bit, then see what accuracy you can get. Sorry for this trial and error approach.

The default appears to be an RBF kernel. The scikit documentation suggests grid search, which, to me, sorry to be somewhat cynical, is little more than fancy trial and error.
[screenshot of the scikit-learn documentation]


If everything else, i.e. the facial identification/recognition library, is basically working, you are really only left with tweaking the scikit classifier and its hyperparameters.
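Concretely, the sort of tweaking I mean is something like this (the grid values are just starting guesses):
Code:
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = {'C': [0.1, 1, 10, 100],
              'gamma': ['scale', 0.001, 0.01, 0.1, 1]}
search = GridSearchCV(SVC(class_weight='balanced'), param_grid, cv=5)
search.fit(X_train, y_train)                # X_train/y_train as in his script
print(search.best_params_, search.best_score_)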

But, as always, perhaps I had a brainfart somewhere. I am very smol brain, so if I made an error in logic somewhere, do not feel shy to point it out. I will neg-rate your comment and seethe, but I promise to admit a mistake where I am wrong.
 

PowerWomon

Kiwi Farm's Most Hated Feminist
kiwifarms.net
If they are passing and minding their own business why do you care?
Excuse me, sir, but this is for SCIENCE!
[Image: Portal 2 Aperture Science wallpaper]

It's also a great excuse to download tons of trap porn and sperg about machine learning at the same time.

We do what we must because we can . . .
 

Exterminate Leftists

Who needs nuance when you have a helicopter?
kiwifarms.net
I don't need an AI for that.


These are excellent posts, thank you!

2: That's right. SeetaFace has extremely high dimensionality and is quite sparse. Since FaceNet has better accuracy, it seems like it would be strictly better to use. It also has nicer tools for cleaning the data.

1, 3: Good idea. I don't have a dataset of male faces, but I'll try to find a comparable one. I would think that above 90% is possible, intuitively.

4: FaceNet states explicitly in the whitepaper that this is not a problem. Human faces are not symmetrical, but the included test suite tests for tilting up to 10deg etc.

5: Even when explicitly set to balance the classes, there are horrible class imbalance problems. The confusion matrix looks like this:
Code:
[[0.90261628 0.09738372]
 [0.47468354 0.52531646]]
What's happening? When I print the raw decisions, they're pretty much all either -1 or 1. This doesn't change with probability=True either.
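(By "raw decisions" I mean the SVM margins, i.e. roughly this:
Code:
# signed distances from the separating hyperplane, one per test sample
margins = clf.decision_function(X_test)
print(margins.min(), margins.max())
print((margins > 0).mean())   # fraction predicted as class 1
With gamma=2 on high-dimensional embeddings, the RBF kernel exp(-gamma*||x-x'||^2) gets tiny between most pairs of points, which could be why the outputs saturate; gamma='scale' might behave very differently. Just an assumption on my part.)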

8: I'm using scikit, that has the same feature.

I'm going to try a neural network next, but please tell me if there's anything wrong in this code that might cause a class imbalance problem:
Code:
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from sklearn.utils import shuffle

# FaceNet embeddings and their label strings, dumped earlier
X = np.load('embeddings.npy')
ls = np.load('label_strings.npy')
y = []
for el in ls:
    y.append(0 if el == 'girls' else 1)

X, y = shuffle(X, y)

classifiers = [SVC(gamma=2, C=1, class_weight='balanced')]
names = ["RBF SVM"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.4, random_state=42)

for name, clf in zip(names, classifiers):
    clf.fit(X_train, y_train)
    score = clf.score(X_test, y_test)
    print(score)
    y_pred = clf.predict(X_test)
    print(confusion_matrix(y_test, y_pred, normalize='true'))
Where are you suggesting I find a dataset of high-res traps cropped perfectly?
If you have a big collection, please post it. This is just what I was able to scrape from PornHub.
You may be able to cross-reference known trannies with their Instagram posts. This is the next generation of thot patrolling. Patrolling dudes.
 

hundredpercent

kiwifarms.net
Facial feminization surgeons often have fair-quality face-on "after" pictures (example).
Quantity > quality. If you have a big database of pictures like that, though, it could be used.

Well, yes. There is a reason, of course. The simplest reason might be that the data simply do not deliver what you are looking for. For example, if there is not enough systematic signal and the markers are too random, the classifier won't be able to classify correctly, just as if you tried to classify the brand of a car based on the paint job alone.

How much can you turn the knobs on the hyperparameters? I will try to take the time and run your code and see if I can get any better results, plus read the scikit documentation.

It might also be helpful to make the problem a little easier. Use nearly balanced datasets, for troubleshooting, that contain feminine women vs. your dataset of non-passing transgender people. It might also be of significance which race they are, so maybe try to run a test scenario with only Caucasian people. If your dataset were imbalanced and you only had black transgender women and only white women, it would probably learn to distinguish Caucasian from non-Caucasian facial features.

It might very well be that your passing transgender people really do pass well enough that it becomes hard to distinguish them.
If it's just inaccuracy, the confusion matrix should look something like this:
Code:
[[0.5, 0.5],
 [0.5, 0.5]]
This wouldn't be imbalanced. Downsampling worked fine. Accuracy wasn't great, but no class imbalance problems:
Code:
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from sklearn.utils import shuffle

X = np.load('embeddings.npy')
ls = np.load('label_strings.npy')
y = []
for el in ls:
    y.append(0 if el == 'girls' else 1)

y = np.array(y)
X, y = shuffle(X, y)

y_bool = np.where(y, 1, 0)

# Indices of each class' observations
i_class0 = np.where(y_bool == 0)[0]
i_class1 = np.where(y_bool == 1)[0]

# Number of observations in each class
n_class0 = len(i_class0)
n_class1 = len(i_class1)

# Randomly sample (without replacement) as many class-0 indices as there
# are class-1 observations, i.e. downsample the bigger class to 50/50
i_class0_downsampled = np.random.choice(i_class0, size=n_class1, replace=False)

# Join the downsampled class 0 together with all of class 1
y = np.hstack((y[i_class0_downsampled], y[i_class1]))
X = np.vstack((X[i_class0_downsampled], X[i_class1]))

classifiers = [SVC(gamma=2, C=1, class_weight='balanced')]
names = ["RBF SVM"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.4, random_state=42)

for name, clf in zip(names, classifiers):
    clf.fit(X_train, y_train)
    score = clf.score(X_test, y_test)
    print(score)
    y_pred = clf.predict(X_test)
    print(confusion_matrix(y_test, y_pred, normalize='true'))
Output:
Code:
0.75625
[[0.76219512 0.23780488]
 [0.25       0.75      ]]
So, unless I'm misunderstanding this, class_weight='balanced' just doesn't work.
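For what it's worth, 'balanced' only rescales C per class by n_samples / (n_classes * bincount(y)); printing the actual weights is a quick way to check it's doing anything at all:
Code:
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

weights = compute_class_weight(class_weight='balanced',
                               classes=np.unique(y_train), y=y_train)
print(weights)   # should show a large weight on the rare class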

In other words, I'll have to tune the hyperparameters. Bayesian optimization should work fine for this. If that won't work, I'll have to train a neural network...

Do you mean small in terms of the number of features? That should be fine, even if it's very pre-filtered data. There might be very few features, but if the facial features it does give you can be used in any way to classify the images, that should be enough nonetheless.
No, in the number of samples. 350 traps gives 700 samples with downsampling to 50/50. Can you really train a neural network with just 700 samples? If not, I'll have to collect more data. Eww, gross.

I think that is already done at this point, by Seeta. That library is, in his current approach, responsible for the embeddings he is using. That step is already finished at the point of the code he is showing, because he appears to be loading the embeddings and classifications from .npy files (AKA "numpy dumps" or whatever you like to call them). He is loading two of them at the beginning of the code he posted.
Not using Seeta anymore, only FaceNet and the included MTCNN face detector.

I would play with the hyperparameters of the classification algorithm you are using a bit (SVM). Sorry, but even after all this time I still cannot give you anything better than a rule of thumb how to choose some of those hyperparameters. I think that is still a common criticism in most fields of machine learning. I see that you are already defining a Python list and are naming the various configurations for the classifiers, which is a thing I like doing as well. Try a few different classifiers and twiddle with their settings a bit, then see what accuracy you can get. Sorry for this trial and error approach.

The default appears to be an RBF kernel. The scikit documentation suggests grid search, which, to me, sorry to be somewhat cynical, is little more than fancy trial and error.

https://scikit-learn.org/stable/tutorial/basic/tutorial.html
If everything else, i.e. the facial identification/recognition library, is basically working, you are really only left with tweaking the scikit classifier and its hyperparameters.

But, as always, perhaps I had a brainfart somewhere. I am very smol brain, so if I made an error in logic somewhere, do not feel shy to point it out. I will neg-rate your comment and seethe, but I promise to admit a mistake where I am wrong.
I'll just downsample instead. Does this hurt accuracy, since I'm throwing away data?
 

wtfNeedSignUp

kiwifarms.net
That is a good question, the central question even, probably. I think it will correspond to or contain such features, because if the features can be reliably used to identify a person in various pictures, then they should also be good enough to identify transgender facial features. In the conception of what it means to be transgender that you would like to teach to the machine learning algorithm, a person cannot look transgender and not transgender at the same time. Otherwise, I think, it would also not be able to match the person reliably, and the reliability of the method they use appears very high.
I mean, damn ... even I have some trouble telling if this is the same person or not in some of those images. So this library, if it really delivers what it says the underlying neural network delivers, appears pretty robust against a variety of image manipulations. Still, I would do a few image flips, or some other simple manipulations. If the similarity metric you can get out of the library is good enough to reliably tell two people apart, then I think it should also be good enough to extract what makes a face look transgender. I hope my logic is sound on this, but maybe I am mistaken and I am being really dense right now. The only other possibility would be, perhaps, that transgender people surprisingly mess up the basic identification of a face somehow, and transgender people cannot be told apart, through this method, from cisgender people who look different to us but are classified as being "the same person" (i.e. get a similarity score that is very high).

Again, the gist of my logic is: The same person cannot look both transgender and not transgender, or it's not the same person, or you have not reliably learned what it means to "look transgender". That is a contradiction with the image library reliably identifying people.
It depends. A network that knows how to differentiate faces might focus on traits that aren't useful for finding gender, but rather identity.
Yes, typically, although that's not always the case. You also have layer and node weights after that. You can normalize those, of course, but then you are practically just doing what a softmax function would have done, i.e. scale the values in such a manner that they add to one, as in a probability distribution. Either way, if we mean the same thing, and we are on the same page here, I do not really care how he gets that distribution. Which activation function he uses is probably a minor detail at this point.
It is still far better in practice.
Right. I think we are on the same page now. That is what I meant. The facial features, i.e. the embeddings that are currently loaded from the numpy files (.npy), should already be robust against that. Do some basic smoke tests that they really are, and that should be fine.

The new neural network I was talking about consists of the extra layers that could be added through transfer learning. In Keras, for example, one would remove the last few layers from a neural network (as those are generally the most specific), mark the remaining weights as fixed, and only train the weights of the layers one has just added. That way, for example, a neural network like AlexNet that is already trained to recognize crows, but perhaps cannot distinguish them from ravens, can be trained to further discriminate between those two classes of birds; one adds a new softmax/cross-entropy layer with the N+1 classes at the end, so that it also knows about ravens in addition to crows (which is also why it does not generally make sense to use a sigmoid activation in the final layer for a classification problem).
Of course, for cases with more than two classes a softmax is a must. Plus, you should experiment with different depths of transfer learning if you have the GPU to train more layers than just the new ones.
No, probably not. I just thought it would add too much complexity and additional variables.

Either way, the approach seems basically sound, to me at least. If the facial features are good enough to recognize the same person this reliably, from their face, then it should also be usable to extract transgender status somehow. The approach makes perfect sense to me. The embeddings that he appears to have generated already should be all that is needed and he is using a support vector machine to do a (non-linear) classification on those feature vectors.

I would play with the hyperparameters of the classification algorithm you are using a bit (SVM). Sorry, but even after all this time I still cannot give you anything better than a rule of thumb how to choose some of those hyperparameters. I think that is still a common criticism in most fields of machine learning. I see that you are already defining a Python list and are naming the various configurations for the classifiers, which is a thing I like doing as well. Try a few different classifiers and twiddle with their settings a bit, then see what accuracy you can get. Sorry for this trial and error approach.

The default appears to be an RBF kernel. The scikit documentation suggests grid search, which, to me, sorry to be somewhat cynical, is little more than fancy trial and error.

If everything else, i.e. the facial identification/recognition library, is basically working, you are really only left with tweaking the scikit classifier and its hyperparameters.

But, as always, perhaps I had a brainfart somewhere. I am very smol brain, so if I made an error in logic somewhere, do not feel shy to point it out. I will neg-rate your comment and seethe, but I promise to admit a mistake where I am wrong.
You should definitely use an MLP rather than an SVM. The feature vectors are very likely too problematic to partition directly.
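Something like this as a drop-in for the SVC in your script (hidden size and alpha are guesses to tune):
Code:
from sklearn.neural_network import MLPClassifier

mlp = MLPClassifier(hidden_layer_sizes=(64,), alpha=1e-2,
                    early_stopping=True, max_iter=1000)
mlp.fit(X_train, y_train)                   # the FaceNet embeddings, as before
print(mlp.score(X_test, y_test))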
 

hundredpercent

kiwifarms.net
It depends. A network that knows how to differentiate faces might focus on traits that aren't useful for finding gender, but rather identity.

It is still far better in practice.

Of course, for cases with more than two classes a softmax is a must. Plus, you should experiment with different depths of transfer learning if you have the GPU to train more layers than just the new ones.

You should definitely use an MLP rather than an SVM. The feature vectors are very likely too problematic to partition directly.
I did some more scraping, and now I have 7000+ trannies and 2000+ girls. How many do I need?

The accuracy is now at "82%," but potentially overfit, since there are duplicates in the tranny set. How to fix: cluster based on cosine similarity and pick a random item from each cluster?
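Something like this is what I have in mind for the dedup (sklearn's AgglomerativeClustering; the distance threshold is a guess that would need tuning, and X/y are the embeddings and labels from the earlier scripts):
Code:
import numpy as np
from sklearn.preprocessing import normalize
from sklearn.cluster import AgglomerativeClustering

X_unit = normalize(X)                      # cosine distance works on unit vectors
clustering = AgglomerativeClustering(n_clusters=None, distance_threshold=0.2,
                                     affinity='cosine', linkage='average')
labels = clustering.fit_predict(X_unit)

# keep one random embedding per cluster of near-duplicates
keep = [np.random.choice(np.where(labels == c)[0]) for c in np.unique(labels)]
X_dedup, y_dedup = X[keep], y[keep]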

The dataset is tainted. The fucking degenerates put men who fuck trannies in their "trap pictures," which severely harms its quality. How to fix: make another classifier and then manually evaluate the edge cases? Then I'll need a male dataset.

For obvious reasons, there's a lot of Asians in the trap dataset. How to fix? Train a separate ethnicity detector and use it to downsample? Will I need to collect more data or is there already a neat one sorted by race/ethnicity?

Did some ad-hoc testing and the results aren't great. Brianna Wu gets 76% cis, ContraPoints pre-transition gets 52% trans, Sanna Marin gets 95% cis, and ContraPoints post-transition gets 99.37% (!!!) cis.

In other words, it's basically just detecting masculinity. This could be because of the extreme outliers, though.

What do you recommend to make a neural network with a small dataset? Is it a good idea to run dimensionality reduction before input to NN?
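i.e., something along these lines is what I mean (component count is a guess):
Code:
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

pipe = Pipeline([
    ('pca', PCA(n_components=64)),          # squeeze the embeddings down first
    ('mlp', MLPClassifier(hidden_layer_sizes=(32,), alpha=1e-2, max_iter=1000)),
])
print(cross_val_score(pipe, X, y, cv=5).mean())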

EDIT: One idea I just got is to make a balanced (25/25/25/25) dataset of MtF trannies, cis women, FtM trannies, and cis men, and then see if it can classify by birth sex. But FtM trannies are rare, and they're not as common in porn, which makes things harder for me.
 

wtfNeedSignUp

kiwifarms.net
I did some more scraping, and now I have 7000+ trannies and 2000+ girls. How many do I need?

The accuracy is now at "82%," but potentially overfit, since there are duplicates in the tranny set. How to fix: cluster based on cosine similarity and pick a random item from each cluster?

The dataset is tainted. The fucking degenerates put men who fuck trannies in their "trap pictures," which severely harms its quality. How to fix: make another classifier and then manually evaluate the edge cases? Then I'll need a male dataset.

For obvious reasons, there's a lot of Asians in the trap dataset. How to fix? Train a separate ethnicity detector and use it to downsample? Will I need to collect more data or is there already a neat one sorted by race/ethnicity?

Did some ad-hoc testing and the results aren't great. Brianna Wu gets 76% cis, ContraPoints pre-transition gets 52% trans, Sanna Marin gets 95% cis, and ContraPoints post-transition gets 99.37% (!!!) cis.

In other words, it's basically just detecting masculinity. This could be because of the extreme outliers, though.

What do you recommend to make a neural network with a small dataset? Is it a good idea to run dimensionality reduction before input to NN?

EDIT: One idea I just got is to make a balanced (25/25/25/25) dataset of MtF trannies, cis women, FtM trannies, and cis men, and then see if it can classify by birth sex. But FtM trannies are rare, and they're not as common in porn, which makes things harder for me.
If you are using FaceNet then it probably has a way to consider multiple faces in an image. Just pick the face with the highest likelihood. Regardless, unless the data is massively tainted, a classifier should still work despite some false positives. Ditto for Asians: unless they are 90% of the dataset, the network will figure out how to deal with them. If you have labels for ethnicity, it might help to add an additional race class output to the network, which can boost performance.

Transfer learning should already work with a small dataset, especially when you have thousands of images. Dimensionality reduction won't help in any way. What is the input to your network? Maybe try to use something like YOLO to cut a square around every person and input that to your network, to reduce background noise.
 

hundredpercent

kiwifarms.net
If you are using FaceNet then it probably has a way to consider multiple faces in an image. Just pick the face with the highest likelihood. Regardless, unless the data is massively tainted, a classifier should still work despite some false positives. Ditto for Asians: unless they are 90% of the dataset, the network will figure out how to deal with them. If you have labels for ethnicity, it might help to add an additional race class output to the network, which can boost performance.
Highest likelihood of what? They're both faces, but the images are labeled by class. If there's an image marked "trap" and it contains one trap and one man, then it's 50/50 which one gets marked trap and which one discarded.
Transfer learning should already work with a small dataset, especially when you have thousands of images. Dimensionality reduction won't help in any way. What is the input to your network? Maybe try to use something like YOLO to cut a square around every person and input that to your network, to reduce background noise.
I am using sklearn's MLPClassifier; the inputs are FaceNet's outputs. The accuracy is slightly better than the SVM's. Adding noise to the images won't change anything.
 